User:Manetta/thesis/chapter-intro: Difference between revisions

From XPUB & Lens-Based wiki
(3 intermediate revisions by the same user not shown)
Line 1: Line 1:
<div style="width:750px;">
<div style="width:750px;">
__TOC__
__TOC__
=i could have written that - intro=
=intro=


== text analytics < > systemization of language ==
==hypothesis==
The results of text mining software are not 'mined', results are constructed.
 
== text mining as writing technique (structure) ==
 
'''chapter 1 - raw language'''
 
the non-man paradox
text as data
parsing exercise
- split (tokenize)
- count (bag-of-words)
- tag (part-of-speech, POS)
the non-text?
the non-text paradox, no context
levels of rawness
ideals of rawness
 
'''chapter 2 - various approaches - 3 case studies'''


This text originates from an interest in the systemization of language that is needed for computer software to be able to 'understand' and process written language.
manager (economy PhD candidate)
* using raw data to make decisions


The aim of this text is to perceive text analytics & statistical computing from another angle: to see, sense, feel and somehow understand what 'gestures' are applied to written words to make them more 'readable' for a computer software/program.
magician (psychologist)
* using the rawness of data as a smoke screen, making use of common sense, clichés and assumptions


as typography does for the human eye, so to speak.
archaeologist (comp. linguist)
* using the rawness of the words as material to work with, to carefully derive information from, by following different standards and procedures


i try to formulate a lot of questions that arose while working with the software, listening to presentations or videos, and reading about the technology in academic papers, books and online articles.
'''chapter 3 - from 'mining' to KDD'''


with these questions i hope to give insight into the particular way these techniques look at language, the fields in which they are applied, and the ideologies with which they seem to be embraced.
examples of the use of the term 'mining' in popular articles!
KDD 1989 version, initial people that coined the term: elements of subjectivity + loops involved
(KDD 2013 version)


questions that hopefully lift some of the layers that cover the techniques, to take a sneak peek at their strength and persuasiveness.
+ parts of Pattern's close reading could maybe illustrate some of the KDD steps in more detail


==hypothesis==
'''conclusion'''
The results of text mining software are not 'mined', results are constructed.


the practice of mining is dirty, messy and contains many gray areas that are tweaked until the results match certain preset expectations.


=links=
=links=
Line 30: Line 53:
[[User:Manetta/thesis/chapter-2 | chapter 2]]
[[User:Manetta/thesis/chapter-2 | chapter 2]]


[[User:Manetta/thesis/chapter-3 | chapter 3]]
</div>
</div>

Latest revision as of 15:10, 30 April 2016

intro

hypothesis

The results of text mining software are not 'mined', results are constructed.

text mining as writing technique (structure)

chapter 1 - raw language

the non-man paradox
text as data
parsing exercise
- split (tokenize)
- count (bag-of-words)
- tag (part-of-speech, POS)
the non-text?
the non-text paradox, no context
levels of rawness
ideals of rawness
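
The three parsing steps named in the chapter 1 outline (split, count, tag) could be sketched roughly as follows. This is only a minimal illustration in plain Python, not the thesis's own code and not how Pattern or any other library does it: the sample sentence, the function names and the tiny hand-made lexicon are all assumptions made up for this example. Real part-of-speech taggers rely on models trained on annotated corpora.

```python
import re
from collections import Counter

# Hypothetical sample sentence, invented for this sketch.
TEXT = "The results are not mined, the results are constructed."

# Toy lexicon: an assumption for illustration only. A real tagger
# would be trained on an annotated corpus instead of a lookup table.
TOY_LEXICON = {
    "the": "DET",
    "results": "NOUN",
    "are": "VERB",
    "not": "ADV",
    "mined": "VERB",
    "constructed": "VERB",
}


def split(text):
    """Tokenize: lowercase the text and keep runs of letters or apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def count(tokens):
    """Bag-of-words: word order is discarded, only frequencies remain."""
    return Counter(tokens)


def tag(tokens, lexicon=TOY_LEXICON):
    """Part-of-speech tagging by lookup; words missing from the lexicon get 'UNK'."""
    return [(token, lexicon.get(token, "UNK")) for token in tokens]


tokens = split(TEXT)   # list of lowercase word tokens
bag = count(tokens)    # mapping from word to frequency
tagged = tag(tokens)   # list of (word, tag) pairs
```

Even in this small sketch every step involves a choice: what counts as a token, whether case and punctuation are kept, and which tag set and lexicon are used.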

chapter 2 - various approaches - 3 case studies

manager (economy PhD candidate)

  • using raw data to make decisions

magician (psychologist)

  • using the rawness of data as a smoke screen, making use of common sense, clichés and assumptions

archaeologist (comp. linguist)

  • using the rawness of the words as material to work with, to carefully derive information from, by following different standards and procedures

chapter 3 - from 'mining' to KDD

examples of the use of the term 'mining' in popular articles!
KDD 1989 version, initial people that coined the term: elements of subjectivity + loops involved
(KDD 2013 version)

+ parts of Pattern's close reading could maybe illustrate some of the KDD steps in more detail

conclusion

the practice of mining is dirty, messy and contains many gray areas that are tweaked until the results match certain preset expectations.

links

thesis in progress (overview)

intro &+

chapter 1

chapter 2

chapter 3