Revision as of 18:31, 6 January 2016

outline

intro

NLP

With 'i-could-have-written-that' i would like to look at technologies that process natural language (NLP). By regarding NLP software as cultural objects, i'll focus on the inner workings of their technologies: how do they systemize our natural language?

what is NLP?

NLP is a category of software packages that is concerned with the interaction between human language and machine language. NLP is mainly present in the fields of computer science, artificial intelligence and computational linguistics.

where is it used in the wild?

NLP is part of translation engines, search engines, speech recognition, auto-correction, chatbots, OCR (optical character recognition), license plate detection, text-mining (using the content on the Web to construct meaningful datasets), ...

why is it important to speak about it?

Placing the computer in a central position in my project, not only as technology but also as a cultural object, makes it possible to reveal how NLP software is constructed to understand human language, and what side-effects this has.

knowledge discovery in data (data-mining)

On the occasion of graduating this year, i would like to look at data-mining.

what is data-mining, text-mining?

Data-mining is the information-processing technology that is part of the knowledge discovery in data process. In many cases the Web is used as an information resource. By training algorithms to recognize patterns in these large sets of information, data is constructed. This data is regarded as valuable, and used (or sometimes sold) for advertisements and decision-making processes, where data-mining results are used as argumentation.

where is it used in the wild?
what (for me) is problematic with data-mining?
what effects does that have?
(how could that be improved?)

hypothesis

The results of data-mining software are not mined, results are constructed.
Which elements allow for algorithmic agreeability?

project

voice: accessible for a wider public

problem formulations:

terminology ('mining', 'data')
text-processing
- from: able to check results with senses (OCR), to: intuition (data-mining) [what are the differences?]
- parsing, how text is treated: as n-grams, chunks, bag-of-words, characters
use of wordclouds
- data as autonomous entity; from: information, to: data science [what are the differences?]
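
The different ways of treating text listed above can be made concrete with a small sketch. It is only an illustration under simple assumptions: words are split on whitespace and lowercased, the helper names (`characters`, `words`, `ngrams`, `bag_of_words`) are hypothetical and not taken from Pattern or any other package, and chunking is left out because it needs a part-of-speech tagger.

```python
from collections import Counter


def characters(text):
    """Treat text as a sequence of single characters."""
    return list(text)


def words(text):
    """Naive tokenization (an assumption): lowercase, split on whitespace."""
    return text.lower().split()


def ngrams(tokens, n):
    """Treat text as overlapping sequences of n neighbouring tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens):
    """Treat text as word counts; word order is thrown away."""
    return Counter(tokens)


sentence = "the results are not mined the results are constructed"
tokens = words(sentence)

print(characters("mined"))
print(ngrams(tokens, 2))
print(bag_of_words(tokens))
```

Each representation keeps something and discards something else: the bag-of-words loses word order, the bigrams keep only direct neighbours. That choice is made before any pattern is "found" in the text.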

algorithmic agreeability case study objects (from the wild)

terminology & anthropomorphism: data 'mining' (wiki-page)
terminology & anthropomorphism: 'machine learning'
terminology: 'data'
wordclouds

thesis

voice: more technical? + theoretical

theory

solutionism & techno optimism

algorithmic agreeability case study objects (field-specific)

workflow of mining software (e.g. Pattern, Weka)
- software workflow diagram
- the use of mathematical graphs & dimensions

research material

→ filesystem interface, collecting research related material (+ about the workflow)
→ wikipage for 'i-could-have-written-that' (list of prototypes & inquiries)
→ little glossary

mining as ideology

* from mining minerals to mining data

anthropomorphism

* anthropomorphic qualities of a computer (?)
* the photographic apparatus → the data apparatus (annotations)
* Joseph Weizenbaum's questions in Computer Power and Human Reason

annotations

Alan Turing - Computing Machinery and Intelligence (1950);
The Journal of Typographic Research - OCR-B: A Standardized Character for Optical Recognition (V1N2) (1967); → abstract
Ted Nelson - Computer Lib & Dream Machines (1974);
Joseph Weizenbaum - Computer Power and Human Reason (1976); → annotations
Walter J. Ong - Orality and Literacy (1982);
Vilem Flusser - Towards a Philosophy of Photography (1983); → annotations
Christiane Fellbaum - WordNet, an Electronic Lexical Database (1998);
Charles Petzold - Code, the hidden languages and inner structures of computer hardware and software (2000); → annotations
John Hopcroft, Rajeev Motwani, Jeffrey Ullman - Introduction to Automata Theory, Languages, and Computation (2001);
James Gleick - The Information, a History, a Theory, a Flood (2008); → annotations
Matthew Fuller - Software Studies. A lexicon (2008);
- Language, Florian Cramer; → annotations
- Algorithm, Andrew Goffey;
Marissa Mayer - the physics of data, lecture (2009); → annotations
Matthew Fuller & Andrew Goffey - Evil Media (2012); → annotations
Antoinette Rouvroy - All Watched Over By Algorithms - Transmediale (Jan. 2015); → annotations
Benjamin Bratton - Outing A.I., Beyond the Turing test (Feb. 2015); → annotations
Ramon Amaro - Colossal Data and Black Futures, lecture (Oct. 2015); → annotations
Benjamin Bratton - On A.I. and Cities : Platform Design, Algorithmic Perception, and Urban Geopolitics (Nov. 2015);

bibliography (five key texts)

Language, Florian Cramer (2008); → annotations
Antoinette Rouvroy - All Watched Over By Algorithms - Transmediale (Jan. 2015); → annotations
The Journal of Typographic Research - OCR-B: A Standardized Character for Optical Recognition (V1N2) (1967); → abstract


User:Manetta/thesis/thesis-outline: Difference between revisions