User:Manetta/graduation-proposals/presentation-assessment-trimester-4

From XPUB & Lens-Based wiki

presentation assessment - graduation proposal

pre

  • needed to take a step back, and look at my proposal in a wider context
  • to be able to formulate my critical position


intro

https://youtu.be/FjibavNwOUI?t=2m11s

these are the results of a text mining process, created by researchers of the WWBP.

the researcher feels the need to excuse himself for the results he presents,

"he didn't make this up". he suggests that he was not involved in the process that produced these results.

why does he do this?

why does he excuse himself?

why does he make a joke about it, while presenting the results as absolute truth?


context / previous practice

1. Cqrrelations

  • introduction to text-mining


2. V2 exhibition - Encyclopedia of Media Objects

  • cataloguing the objects from the show in the structure of a training set for image recognition

→ these categories reveal how the algorithm is trained to 'interpret' our world


3. relearn

  • continuation of cqrrelations, looking at technical text mining process
  • technical --> cultural focus
  • culture around text mining --> creates foundations --> influences how you can look at results
  • culture of 'algorithmic agreeability'


algorithmic agreeability

World-Well-Being-Project-wordclouds.gif


1. no human responsibility

  • results are the 'direct data'
  • absolute truth --> in a cloud (a cloud normally changes all the time, but here the results are fixed)
  • most common words in the middle, a bit like the heart of an entity
  • made me think of 'death of the author', death of the text-miner


Antropomorphic-reading-terms.gif


2. anthropomorphism

  • anthropomorphic terms are used to describe the processes
  • which might create wrong expectations of these algorithms


Meta-metaphors data-mining.gif


  • direct access to the data, without any form of mediation
  • the data is true, because it is out there


EUR-PhD-defence-sentiment-mining.JPG


3. assumptions are re-used

  • from text mining discourse
  • academic context, PhD
  • the assumptions these projects start from (profiling people + sentiment = pos/neg)


Whenever-i-fire-a-linguist.gif


4. no linguist, but data --> autonomous

  • Frederick Jelinek, IBM, 80s --> speech recognition software
  • because the linguistic information was richer IN THE DATA
  • no need for a linguistic interpretation
  • belief in the data

i think this then leads to

→ data autonomy

data

  • autonomous entity
  • we accept it as it is
  • because it is there


World-well-being-project words-across-age loop.gif


→ what do these results mean????

  • what happens at 40/45???
  • do these results confirm your 'common sense'?
  • how can we have an opinion about this?


these four points

1. no human responsibility
2. anthropomorphism
3. assumptions are re-used
4. data = direct resource, no need for linguistic interpretation
  • show how the culture around text mining frames how we interpret text-mining results
  • anthropomorphism --> mystifies the process
  • data = framed as autonomous entity, because it is out there


but

Text-mining-technical-process.png


if we look at the technical process, 3 aspects

  • data documents
  • humans
  • software




a text mining process consists of human decisions

which have a big influence on the outcomes

but algorithmic agreeability frames it as if no human decisions are involved

so it seems that we have direct access to the data, without any mediating layer

but already the human involvement is a form of mediation:

  • deciding which features to mine for
  • selecting the data to work with
  • simplifying written text to data
  • comparing results with the expectations
  • if not satisfied --> through trial and error, redo the whole process
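
a minimal sketch of where these decisions sit in a text mining script. this is not the WWBP method; the documents, labels, stopword list and threshold are all made up for illustration. every constant and function marked "decision" is a choice made by a person, not something that comes out of the data.

```python
from collections import Counter
import re

# decision: selecting the data to work with (hypothetical toy documents,
# with labels assigned by hand)
DOCUMENTS = [
    ("i love my sunny day at the beach", "pos"),
    ("what a lovely sunny morning", "pos"),
    ("i hate this rainy boring day", "neg"),
    ("boring rainy week again", "neg"),
]

# decision: which words are thrown away before counting
STOPWORDS = {"i", "my", "at", "the", "a", "this", "what"}


def tokenize(text):
    """decision: simplify written text to lowercase word tokens."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def mine(documents, min_count=2):
    """count words per label and keep those seen at least min_count times.

    decision: min_count defines what counts as a feature worth reporting.
    """
    counts = {}
    for text, label in documents:
        counts.setdefault(label, Counter()).update(tokenize(text))
    return {
        label: sorted(w for w, n in counter.items() if n >= min_count)
        for label, counter in counts.items()
    }


results = mine(DOCUMENTS)
print(results)
```

lowering `min_count` to 1 makes the word "day" show up under both labels, so the 'result' shifts as soon as one of the human decisions shifts. that is the trial and error step: adjust, rerun, compare with expectations.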




the 4 cultural points of agreeability make it possible

to regard the text mining process as a kind of data-religion

it starts to show similarities, such as

  • human decisions are presented as absolute truth
  • patterns in a set of abstract meaningless data are explained
--> when you pray every day, things will go well
--> when you keep on 'mining', you will get the results you expect

to conclude

i feel skeptical about text-mining processes

  • correlating → assigning meaning to patterns in language use --> raises many questions for me
  • and the human decisions are presented as absolute truth
  • feels like a data-religion, because of the strong sense of algorithmic agreeability


grad. project

how to work from these worries?

--> i think it can be useful to reveal this cultural agreeability
--> it shows how we relate to data & automated processes
  • practical next steps:
    • choosing a case study


Wwbp.png


for example the World Well Being Project

  • analyse cultural elements they use
    • visual language they use to show results
    • vocabulary use, definitions
    • context → in which field are they active?

i can imagine a collection of these elements

i am thinking about making a publication, or publication series

maybe not only about text-mining, but it could expand later to other fields of natural language processing

prototypes

  • list of departments that text-mining falls under
  • ocr vs. text mining


Automated-reading-machines ocr-text-mining.png


  • 'information' (thesaurus & wordnet)
  • Joseph Weizenbaum questions