User:Manetta/graduation-proposals/proposal-0.2


graduation proposal +0.2

title: "i could have written that"

alternatives:

  • typographic technologies / typographic systems
  • turning words into numbers

Introduction

For in those realms machines are made to behave in wondrous ways, often sufficient to dazzle even the most experienced observer. But once a particular program is unmasked, once its inner workings are explained in language sufficiently plain to induce understanding, its magic crumbles away; it stands revealed as a mere collection of procedures, each quite comprehensible. The observer says to himself "I could have written that". With that thought he moves the program in question from the shelf marked "intelligent" to that reserved for curios, fit to be discussed only with people less enlightened than he. (Joseph Weizenbaum, 1966)

what do you want to do?

to set up a publishing platform to reveal the inner workings of technologies that systemize natural language, for example: data/text-mining (text-parsing, text-simplification, vector space models, algorithmic recognition analyses, algorithmic culture), machine learning (training sets, taxonomies, categories), and logic (simplification, a universal representative system); through tools that function as natural-language interfaces (for both humans and machines), and that could be regarded as (contemporary) typographic systems.

/

to set up a publishing platform to reveal the way these reading-writing systems touch on the issues of systemization / automation / an algorithmic 'truth' that involve simplification / probability / modeling processes ...

... by looking closely at the material (technical) elements that are used to construct certain systems ... (in order to look for alternative perspectives)
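as a minimal illustration of what 'turning words into numbers' means in practice, the sketch below (a hypothetical Python example of my own, not part of the platform) reduces two sentences to vectors of word counts, after which 'similarity' is nothing but arithmetic:

    # a minimal vector space model: each text becomes a vector of
    # word counts, and 'similarity' becomes arithmetic
    from collections import Counter
    import math

    texts = ["the machine reads the text",
             "the human reads the machine"]

    vocabulary = sorted(set(word for t in texts for word in t.split()))

    def vectorize(text):
        counts = Counter(text.split())
        return [counts[word] for word in vocabulary]

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm

    vectors = [vectorize(t) for t in texts]
    print(vocabulary)        # the 'dimensions' of the model
    print(vectors)           # the texts, reduced to numbers
    print(cosine(*vectors))  # similarity, reduced to a single number

everything that is not a word count (word order, tone, context) has been simplified away at this point; exactly the kind of inner working the platform wants to expose.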


(a first prototype to start collecting related material is online here: http://pzwart1.wdka.hro.nl/~manetta/i-could-have-written-that/)


Relation to a larger context

"i could have written that" could be seen as a reaction on an article published in 'The Journal of Typographic Research' (V1N2-1967), in which the typeface OCR-B was enthusiastically presented. OCR-B is designed by Adrian Frutiger and — as the name already implies — optimized for machinic reading. It was a reaction to OCR-A, which had similar intention, but OCR-B is been developed to also be 'aesthetically accepted by the human eye'. The author ends the article with stating the 'hope that one day "reading machines" will have reached perfection and will be able to distinguish without any error the symbols of our alphabets, in whatever style they may be written.'


current or former (related) magazines:

other publishing platforms:


Relation to previous practice

In the last year, i've been looking at different tools that contain linguistic systems. From speech-to-text software to text-mining tools, they all systemize language in various ways in order to 'understand' natural language, as human language is called in computer science. These tools fall under the term 'Natural Language Processing' (NLP), a field of computer science that is closely related to Artificial Intelligence (AI).
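as a small example of such systemization (using NLTK here, one possible toolkit among many; the sentence is my own), a sentence is cut into tokens, and every token is assigned a grammatical category from a fixed tag set:

    # tokenizing and tagging: language, systemized into a fixed tag set
    import nltk

    # the tokenizer models and the tagger are separate downloads
    nltk.download('punkt')
    nltk.download('averaged_perceptron_tagger')

    sentence = "I could have written that."
    tokens = nltk.word_tokenize(sentence)
    print(nltk.pos_tag(tokens))
    # prints (token, part-of-speech) pairs such as ('written', 'VBN'),
    # using the Penn Treebank tag set: whatever does not fit the tag
    # set has no place in the system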

As a continuation of that, i took part in the Relearn summer school in Brussels last August, where i proposed a working track in collaboration with Femke Snelting on the subject of 'training common sense'. With a group of people we have been trying to deconstruct the truth-construction process in algorithmic cultures: looking at data-mining processes, deconstructing the mathematical models that are used, finding the moments where semantics are mixed with mathematical models, and understanding which cultural context is created around this field. These steps were taken in close relation to a text-mining software package called 'Pattern'. The workshop during Relearn transformed into a project that we called '#!Pattern+', which will be strongly collaborative and ongoing over a longer time span. #!Pattern+ will be a critical fork of the latest version of Pattern, including reflections and notes on the software and the culture that surrounds it. The README file that has been written for #!PATTERN+ is online here, and more information is collected on this wiki page.
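to give an impression of where Pattern mixes semantics with mathematical models (a simple session of my own; #!Pattern+ itself goes much further than this):

    # Pattern annotates words with tags, and reduces sentences to numbers
    from pattern.en import parse, sentiment

    s = "The machine understands language wonderfully."
    print(parse(s))
    # every word becomes word/part-of-speech/chunk/role

    print(sentiment(s))
    # a (polarity, subjectivity) pair, polarity between -1.0 and +1.0,
    # looked up in a hand-made lexicon of word scores: a moment where
    # semantics turn into a mathematical model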

Another entry point to understanding what happens in algorithmic practices such as machine learning is to look at the training sets that are used to train software to recognize certain patterns in a set of data. These training sets could contain a large collection of images, texts, 3d models, or videos. By looking at such datasets, and more specifically at the choices that have been made in terms of structure and hierarchy, steps in the construction of a certain 'truth' are revealed. For the exhibition "Encyclopedia of Media Object" in V2 last June, i created a catalog, voice-over and booklet, which placed the objects from the exhibition within the framework of the SUN database, a resource of images for image-recognition purposes. (link to the "i-will-tell-you-everything (my truth is a constructed truth)" interface)
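a hypothetical miniature of such a training-set hierarchy (the category names below are invented, not taken from the SUN database) already shows how the taxonomy decides in advance which 'truths' a trained model will be able to express:

    # a hypothetical miniature training set: the taxonomy is a design
    # decision, made before any learning happens
    taxonomy = {
        "indoor": {
            "workspace": ["office", "classroom"],
            "domestic": ["kitchen", "bedroom"],
        },
        "outdoor": {
            "natural": ["beach", "forest"],
            "man-made": ["street", "parking lot"],
        },
    }

    def labels(tree, path=()):
        """Flatten the hierarchy into the full label of every leaf."""
        for key, value in tree.items():
            if isinstance(value, dict):
                yield from labels(value, path + (key,))
            else:
                for leaf in value:
                    yield "/".join(path + (key, leaf))

    for label in labels(taxonomy):
        print(label)
    # whatever is not a leaf in this tree simply cannot be recognized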

There are a few datasets in the academic world that seem to be the basic resources upon which these training sets are built. In the field they are called 'knowledge bases'. They live on a more abstract level than the training sets do, as they try to create a 'knowledge system' that could function as a universal structure. Examples are WordNet (a lexical dataset), ConceptNet, and OpenCyc (an ontology dataset). In the last months i've been looking into WordNet, worked on a WordNet Tour (still ongoing), and made an alternative browser interface (with cgi) for WordNet. It is all a process that has not yet been transformed into an object/product, but it is documented here and here on the Piet Zwart wiki.
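WordNet can also be walked programmatically, which makes its claim to a universal structure inspectable; for example through the NLTK interface (one possible way in, next to the cgi browser mentioned above):

    # walking WordNet: every synset hangs in a chain of hypernyms
    # that ends at the single root 'entity'
    from nltk.corpus import wordnet as wn  # needs nltk.download('wordnet')

    synset = wn.synsets('book')[0]                 # first sense of 'book'
    print(synset.name(), '-', synset.definition())

    for path in synset.hypernym_paths():
        print(' > '.join(s.name() for s in path))
    # 'book' is placed under 'entity' via a fixed chain of categories:
    # a structural decision that presents itself as universal knowledge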

Thesis intention

Practical steps

how?

  • writing/collecting from a technological point of departure, as has been done before by:
- Matthew Fuller, powerpoint (+ in 'software studies, a lexicon')
- Constant, pipelines
- Steve Rushton, feedback
- Angie Keefer, Octopus
  • touching the following issues around the systemization of language:
- automation (of tasks/human labour; algorithmic culture, machine learning, ...)
- simplification (as a step in the process; turning text into numbers)
- the aim for a universal system (taxonomy structures, categorization, ascii/unicode, logic; see the sketch after this list)
- does it work? (revealing the inner workings and non-workings of technologies)
- cultural context (algorithmic agree-ability, belief in technology, AI, the aim for invisibility / naturalization)
  • while using open-source software, in order to be able to have a conversation with the tools under discussion, and to open them up.
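the sketch referred to in the list above is the most literal version of the aim for a universal system: in ascii/unicode, every character is a number by prior agreement.

    # the most literal 'turning text into numbers': the Unicode standard
    # assigns every character a number, agreed upon in advance
    text = "ASCII?"
    print([ord(ch) for ch in text])   # code points: [65, 83, 67, 73, 73, 63]
    print(text.encode('utf-8'))       # the same agreement, expressed as bytes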


sort of 'mission statements'

As a magazine, blog, or newsletter is a periodical format that evolves over time, it captures and reflects a certain time and location, including the ideals and concerns of the present. Setting up such a publishing platform partly also comes from an archival aim. Looking back now at the issues of Radical Software published in the 1970s, for example, gives me an urge to capture today's concerns about (algorithmic) technologies (data monopolies, a strong belief in algorithms, and an objectification of mathematics, for example). By departing from a very technical point of view, i hope to develop a stage for alternative perspectives on these issues (making 'it-just-not-works' tutorials, for example). And by regarding machine-learning processes as typography, and therefore as reading/writing machines or processes, i could create a playground where i not only collect information about such systems, but also put them into practice.

It would be great if technology were as visible as possible again, opened up and deconstructed, at a time when the invisibility of technique is key and computers or phones 'just' work. These ideals come from a certain set of cultural principles present in the field of open source: take for example the importance of distribution instead of centralization, the aim of making information available for everyone (in the sense that it should not only be available but also legible), and the openness of software packages, which makes it possible to dig into the files a piece of software uses to function.

Coming from a background in graphic design, being closer to techniques and to open-source principles brings up a whole set of new design questions. For example: how can an interface reveal its inner system? How can structural decisions be design actions?

questions of research

  • if natural language systems can be regarded as typography, what reading/writing options does that bring?
  • how to build and maintain a (collaborative) publishing project?
    • technically: what kind of system to use to collect? a wiki? a mailing-list interface?
    • what kind of system to use to publish?
    • publishing: online + print --> inter-relation
    • in what context?

References

datasets

* WordNet (Princeton)
* ConceptNet 5 (MIT Media)
* OpenCyc

people

algorithmic culture

Luciana Parisi
Matteo Pasquinelli
Antoinette Rouvroy
Seda Gürses

other

Software Studies: A Lexicon, edited by Matthew Fuller (2008)

reading list

notes and related projects

BAK lecture: Matthew Fuller, on the discourse of the powerpoint (Jun. 2015) - annotations

project: Wordnet

project: i will tell you everything (my truth is a constructed truth)

project: serving simulations