presentation assessment - graduation proposal

pre

needed to take a step back, and look at my proposal in a wider context
to be able to formulate my critical position

intro

https://youtu.be/FjibavNwOUI?t=2m11s

these are the results of a text mining process, created by researchers of the World Well-Being Project (WWBP).

the researcher feels the need to excuse himself for the results he presents,

"he didn't make this up". he suggests that he was not involved in the process that led to these results.

why does he do this?

why does he excuse himself?

why does he make a joke about it, while presenting the results as absolute truth?

context / previous practice

1. Cqrrelations

introduction to text-mining

2. V2 exhibition - Encyclopedia of Media Object

cataloguing the objects from the show in the structure of a training set for image recognition

→ how these categories reveal the way the algorithm is trained to 'interpret' our world

3. relearn

continuation of cqrrelations, looking at technical text mining process

technical --> cultural focus
culture around text mining --> foundations are created --> these influence how you can look at the results
culture of 'algorithmic agreeability'

algorithmic agreeability

1. no human responsibility

results are the 'direct data'
absolute truth --> in a cloud (cloud = changing all the time, now results are fixed)
most common words in the middle, bit like a heart of an entity
made me think of 'death of the author', death of the text-miner

2. anthropomorphism

anthropomorphic terms are used to describe the processes
which might create the wrong expectations of these algorithms

direct access to the data, without any form of mediation
the data is true, because it is out there

3. assumptions are re-used

from text mining discourse
academic context, phd
the assumptions from which these projects start (profiling people + sentiment = pos/neg)

4. no linguist, but data --> autonomous

Frederick Jelinek, IBM, 80s --> speech recognition software
because the linguistic information was richer IN THE DATA
no need for a linguistic interpretation
belief in the data

i think this leads then to

→ data autonomy

data

autonomous entity
we accept how it is
because it is there

→ what do these results mean????

what happens at 40/45???
do these results confirm your 'common sense'?
how can we have an opinion about this?

these four points

1. no human responsibility
2. anthropomorphism
3. assumptions are re-used
4. data = direct resource, no need for linguistic interpretation

show how the culture around text mining frames how we interpret text-mining results
anthropomorphism --> mystifies the process
data = framed as autonomous entity, because it is out there

but

if we look at the technical process, 3 aspects

data documents
humans
software

a text mining process consists of human decisions

these have a big influence on the outcomes

but algorithmic agreeability frames the process as if no human decisions are involved

so: we have direct access to the data, without any mediating layer

but already the human involvement is a form of mediation:

what are the features to mine for
selecting the data to work with
simplifying written text to data
comparing results with the expectations
if not satisfied --> redo the whole process through trial and error
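
a minimal sketch of how these decisions shape an outcome. this is not the WWBP's method: the mini-corpus, the `top_words` function and both sets of choices are hypothetical and invented for illustration only. the same three sentences are counted twice, once without any filtering and once with a human-chosen stopword list and minimum word length.

```python
from collections import Counter
import re

# Hypothetical mini-corpus, invented for this illustration only.
DOCUMENTS = [
    "I love my family and I love my work",
    "I hate the traffic and the rain",
    "My work is not bad, the rain is",
]


def top_words(documents, stopwords, min_length, n=3):
    """Return the n most frequent words after human-chosen filtering."""
    counts = Counter()
    for text in documents:
        # Decision: lowercase everything and keep only sequences of letters.
        tokens = re.findall(r"[a-z]+", text.lower())
        # Decision: which words are treated as 'noise' and thrown away.
        kept = [t for t in tokens if t not in stopwords and len(t) >= min_length]
        counts.update(kept)
    return counts.most_common(n)


# Choice A: no filtering at all.
choice_a = top_words(DOCUMENTS, stopwords=set(), min_length=1)

# Choice B: a hand-picked stopword list and a minimum word length.
choice_b = top_words(
    DOCUMENTS,
    stopwords={"i", "my", "the", "and", "is", "not"},
    min_length=3,
)

print("choice A:", choice_a)
print("choice B:", choice_b)
```

the 'most common words' differ between the two runs although the data is identical: only the human decisions changed. note that choice B also removes 'not', so "not bad" is reduced to "bad".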

the 4 cultural points of agreeability make it possible

to regard the text mining process as a kind of data-religion

it starts to show similarities, such as

human decisions are presented as absolute truth
patterns in a set of abstract meaningless data are explained

--> when you pray every day, things will go well
--> when you keep on 'mining', you will get the results you expect

to conclude

i feel skeptical about text-mining processes

correlating → assigning meaning to patterns in language usage --> raises many questions for me
and the human decisions involved are presented as absolute truth
feels like a data-religion, because of the strong sense of algorithmic agreeability

grad. project

how to work from these worries?

--> i think it can be useful to reveal this cultural agreeability
--> it shows how we relate to data & automated processes

practical next steps:
- choosing a case study

for example the World Well-Being Project

analyse cultural elements they use
- visual language they use to show results
- vocabulary use, definitions
- context → in what field active?

i can imagine a collection of these elements

i am thinking about making a publication, or publication series

maybe not only about text-mining, but it could expand later to other fields of natural language processing

prototypes

list of departments that text-mining falls under
ocr vs. text mining

'information' (thesaurus & wordnet)
Joseph Weizenbaum questions

User:Manetta/graduation-proposals/presentation-assessment-trimester-4
