User:Fako Berkers/project2

Sniff, Scrape, Crawl

WikiAPI

I have had a look at Wikipedia and I'm especially interested in categories that include people. There is, for instance, a category of Marxist theorists (to stay a little bit in the same genre as last trimester). Its page lists all people categorized as Marxist theorists and nothing else.
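A minimal sketch of how the member list of such a category could be requested through the MediaWiki API, using its `list=categorymembers` query. The category name "Marxist theorists", the helper names and the User-Agent string are assumptions made for illustration, and the sample response in the usage note is hand-written, not real API output.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://en.wikipedia.org/w/api.php"


def category_members_url(category, limit=50, cmcontinue=None):
    """Build the query URL for one batch of members of a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmlimit": limit,
        "format": "json",
    }
    if cmcontinue:
        # Token handed back by the previous batch, used for paging.
        params["cmcontinue"] = cmcontinue
    return f"{API_URL}?{urlencode(params)}"


def parse_members(response):
    """Return (article titles, continuation token or None) from a decoded response."""
    members = response.get("query", {}).get("categorymembers", [])
    # Namespace 0 holds ordinary articles; subcategories (ns 14) are skipped.
    titles = [m["title"] for m in members if m.get("ns") == 0]
    token = response.get("continue", {}).get("cmcontinue")
    return titles, token


def fetch_all_members(category):
    """Page through the whole category (needs network access)."""
    titles, token = [], None
    while True:
        request = Request(
            category_members_url(category, cmcontinue=token),
            headers={"User-Agent": "category-sketch/0.1 (example)"},  # hypothetical
        )
        with urlopen(request) as resp:
            batch, token = parse_members(json.load(resp))
        titles.extend(batch)
        if not token:
            return titles
```

Calling `fetch_all_members("Marxist theorists")` would walk through all batches. The parsing step can be tried offline on a hand-written response such as `{"query": {"categorymembers": [{"pageid": 1, "ns": 0, "title": "Karl Marx"}]}}`.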

I find categories exciting whenever I regard them as communities. The persons listed there may not even be aware of this community, but in fact some common ideal or subject or whatever binds these persons together.

I would like to sniff, scrape and crawl in a number of ways to reveal these communities to themselves and others. The following possibilities occurred to me when viewing the Wiki API:

try to fetch jargon used by a community (or their wiki users/pages)
try different kinds of mapping (most quoted, highest rank by Google, most backlinks, voted most important by own community, voted most important by critics)
fetch total bibliography of community and make up sorting algorithms
create a "fieldview" by relating the communities of critics to the community being portrayed
try a community kickstart by putting email addresses associated with the names on a mailing list
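The first idea in the list, fetching jargon, could be sketched roughly as follows. This is only one possible approach, assumed here for illustration: words count as jargon when they occur relatively more often in the community's pages than in some background text. The function names, the thresholds and the crude smoothing are all assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def jargon(community_texts, background_texts, top=5, min_count=2):
    """Rank words by how much more frequent they are in the community texts."""
    community = Counter(w for t in community_texts for w in tokenize(t))
    background = Counter(w for t in background_texts for w in tokenize(t))
    c_total = sum(community.values()) or 1
    b_total = sum(background.values()) or 1

    scores = {}
    for word, count in community.items():
        if count < min_count:
            continue  # ignore words seen too rarely to say anything
        c_rate = count / c_total
        # Crude add-one smoothing so unseen background words do not divide by zero.
        b_rate = (background[word] + 1) / (b_total + 1)
        scores[word] = c_rate / b_rate

    # Highest ratio first; ties are broken alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top]
```

With page texts of the community members as `community_texts` and, say, a handful of unrelated articles as `background_texts`, common words like "the" sink to the bottom while community-specific terms rise.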

In the long run, small apps like these might build up to article validation. For instance, if a text called text.A contains jargon from community.13, then a computer could see to whom the described ideas belong and how these are regarded by other communities (critiques) and by the rest of the world (popularity measured through Google ranking).
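A minimal sketch of the matching step described above, assuming each community is already represented by a set of jargon terms. The scoring rule (the share of a community's jargon that shows up in the text) and the label `community.7` are assumptions for illustration; `text.A` and `community.13` follow the naming used above.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def match_communities(text, community_jargon):
    """Rank communities by the share of their jargon that appears in the text."""
    words = set(tokenize(text))
    scores = {}
    for name, terms in community_jargon.items():
        terms = set(terms)
        scores[name] = len(words & terms) / len(terms) if terms else 0.0
    # Best match first; ties are broken by community name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical jargon sets and text, made up for this example.
community_jargon = {
    "community.13": {"surplus", "commodity", "proletariat"},
    "community.7": {"qubit", "entanglement"},
}
text_a = "The commodity form hides the surplus extracted from labour."
ranking = match_communities(text_a, community_jargon)
```

Here `ranking` puts `community.13` first, since two of its three jargon terms occur in the text. How other communities regard those ideas would need further data that this sketch does not cover.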

Article validation may be useful to counter information overload, but I do think that users should always be able to favor certain writers manually. This is to make sure that people themselves choose to ignore or favor certain writing, instead of a computer telling them what to read because that is what most people read.