User:Angeliki/Grad project loop

Human always in the communication loop

My title derives from the term human-in-the-loop (HITL), which “is a branch of artificial intelligence that leverages both human and machine intelligence to create machine learning models” (Anadiotis, no date).

What do I want to make?

I want to make a series of experiments in which participants imitate activities that happen in parallel with our communication systems. One possible outcome would be an online database of all these practices and the material they produce. It could work as a toolkit for trying these experiments again in the future. In the end I aim to direct a final action based on that database.
Claude Shannon's diagram of 1948 presents how a general communication system works. The message is passed through a transmitter that encodes it and transfers it through a one-way channel to the receiver. The receiver decodes the message, and the person on the other side listens to it or reads it. This way of communicating, created by the technology of the previous century, brought a new relationship with our bodies. One of its most intense experiences was the detachment of our voice from our physical bodies. Our voice, mediated through different kinds of media, gets distorted, sounds unfamiliar and becomes disconnected from us.
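The one-way system described at the start of this paragraph can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not Shannon's formal model: the function names, the choice of bits as the signal and the `noise_probability` parameter are all hypothetical.

```python
import random


def transmit_encode(message):
    """Transmitter: encode the message as a signal (here, a list of bits)."""
    bits = []
    for byte in message.encode("utf-8"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits


def channel(signal, noise_probability=0.0, rng=None):
    """One-way channel: each bit may be flipped by noise (assumed noise model)."""
    rng = rng or random.Random(0)
    return [bit ^ 1 if rng.random() < noise_probability else bit
            for bit in signal]


def receive_decode(signal):
    """Receiver: decode the received bits back into a readable message."""
    data = bytearray()
    usable = len(signal) - len(signal) % 8  # ignore an incomplete last byte
    for start in range(0, usable, 8):
        byte = 0
        for bit in signal[start:start + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    # Bytes damaged by noise may no longer be valid text, so replace them.
    return data.decode("utf-8", errors="replace")


# Without noise the message arrives intact; with noise it gets distorted.
clean = receive_decode(channel(transmit_encode("my voice")))
noisy = receive_decode(channel(transmit_encode("my voice"), noise_probability=0.05))
```

Raising `noise_probability` is a crude stand-in for the distortion of the mediated voice: the received message drifts further from what was sent.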

Nowadays, the new data practices of big corporations like Google and Facebook intervene in this communication system by listening to, transcribing and documenting our voices and messages. It is not widely known that the most important part of making and training an intelligent ‘assistant’ based on a speech recognition tool, like a GPS voice or Alexa, is the contribution of huge amounts of human voice data. The samples used by the companies and organisations come from different sources and communication systems around the world [imitations of conversations, radio broadcasts, telephone conversations, field recordings, online readings], sometimes with the permission of the people donating their voice. At the same time, tools like automatic dialect analysis are used by the German state to verify refugees’ claims of origin. They can often get it wrong, because "Identifying the region of origin for anyone based on their speech is an extremely complex task" and depends on "a wide range of factors" (Sputnik, no date). Google also uses voice recordings of its users from other Google apps to train its speech recognition tool.
Refugee interviewed
So, based on the contemporary context, I annotated the diagram with the possible interventions of today’s corporations. Even though the outcome of data practices is a machine or product, a huge amount of human activity is involved in training it, such as transcribing, recording and speaking. The one-way channel becomes a loop, where the user who communicates is also ‘input’ for the machine that will provide her or his tool of communication. We are more than users of various technical media; we are part of the loop.
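The loop can be sketched as a channel that keeps a copy of everything passing through it and uses that copy to "train" the tool it offers back to the user. This is a deliberately naive toy and does not describe how any company's system actually works; the class `LoopedChannel`, its word-count "model" and the method names are hypothetical.

```python
from collections import Counter


class LoopedChannel:
    """A channel whose traffic doubles as training input for its own tool."""

    def __init__(self):
        self.training_corpus = []    # every message users have sent so far
        self.vocabulary = Counter()  # toy "model": how often each word was heard

    def send(self, message):
        """Deliver the message and, at the same time, capture it as training data."""
        self.training_corpus.append(message)
        self.vocabulary.update(message.lower().split())
        return message  # the receiver still gets the message unchanged

    def recognises(self, word):
        """The tool only 'understands' words that users have already supplied."""
        return self.vocabulary[word.lower()] > 0


loop = LoopedChannel()
known_before = loop.recognises("recording")  # nothing has been said yet
loop.send("So we are recording")
known_after = loop.recognises("recording")   # the user's own speech trained the tool
```

The user never sees `training_corpus`, yet the tool's ability to recognise anything depends entirely on what users have said through it.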
As Žižek observes, it is very difficult for the common person to understand how algorithms work, “but we can easily understand how we are controlled by the digital grid” (The Economist, 2018). I want to make the human presence in the loop visible by understanding it and sharing it in a physical space in real time.

How do I plan to make it?

I will make this condition (of the human always in the loop) visible through a collection of material related to the training data of speech recognition tools [1] [voice samples, ways of collecting samples] and through practices that re-enact the loop with the human activities involved in it. More specifically, these practices will relate to the human processes of the loop, such as the presence of human annotators [transcription], listening to and identifying voices, decoding and encoding, and donating voice samples, but also to other elements like the delay in the circuit and the spatial or hardware qualities that distort the voice. Below I present some examples I am already involved in that can support my approach:
Here are some examples of voice samples that I found from an organisation involved in training a speech recognition tool [pocketsphinx]:

(microphone conversation)

Interview 15
(A=Interviewer; B=Interviewee)
A: So we are recording.  Awesome.  So how long have you lived in Flint? (unclear)
B: 38 years.
A: Is that your whole life?  Wow you look really young.
B: Thank you!(...)

(telephone conversation/ giving directions on spot while walking)

And this is an example of a description accompanying these voice samples, which shows how a voice sample was made: ...The number of interviewees in a single program varies from one to three, but typically, one interviewer and two interviewees appear in the program. The material includes passages of interactive dialogue, but longer stretches of monologue-like speech comprise the majority of the collected data...
To start grasping and controlling a reality that we are already in [human-in-the-loop], we should also understand the processes happening in relation to our bodies. Exercises like the deep listening sessions of Pauline Oliveros can be a daily experiment through which I can find a place between our bodies and the technology of mediation. They are about collectively sharing and following instructions that connect us with the inner functions of the body, like circulation, electromagnetism and vibrations, by reading, listening and moving. My personal experience was that I became a medium by repeating a video with instructions.

What is my timetable?

Until the end of January: I will conduct interviews with people involved in collecting voice data or engaged with and re-appropriating communication technologies. I will participate in workshops or lectures related to these topics throughout the year, like CCC (Chaos Communication Congress). I will research recordings, databases and ways of collecting voice samples for training. At the same time I will experiment with mediation in private and public spaces. I will also do more experiments on collective reading, listening, repeating and walking. I will document my observations and the audio, video, photos and text produced.
26/11: in my pyratechnic session with Alice I will prepare a loop process in which the students will try different actions related to the loop in my diagram.
30/11: I will try the same in Leeszaal, where people from different countries and backgrounds visit.
27-30/11: CCC
February: I will relate the dynamics of the previous experiments to each other and proceed by involving more people in the process. I will invite them to share their personal relation to these experiments [how they perceive the mediation of their voice and their presence in the loop] and use this material in the process. I will merge my process with facts related to the people involved (where that process is important or not).
March-April: Considering the documentation and outcome of my processes, I will re-think my next steps and select the practices that have the most impact.
May-June: Wrap up

Why do I want to make it?

I believe that all these procedures happen on a daily basis, with or without our knowledge. They have an effect on our physical bodies that we unconsciously overlook. I want to create an experiential model of collectively sharing our voices, based on the system I am researching and described earlier. My purpose is to strengthen our awareness of our involvement in the loop and to answer some questions that bother me. I want to investigate these questions with the groups of people who will be involved in the process. Some of these questions are: how are our bodies influenced by these systems, what control do we have over them, what are the new relations with our voices, and how can we appropriate these new technologies more consciously?
Our communication and our relation with these systems have mainly been formed by big corporations or states, and their character is mostly male, technical or military. It seems to me that the communication platforms are estranged* realities that are difficult to understand. The context (social and cultural background, gender) that defines an individual disappears or is manipulated in this massive automated process happening throughout our communication systems. The same goes for our data and bodies. I am especially interested in voice because it is a personal and unique element. For oral cultures the voice was a medium for spreading knowledge, in a way that differs a lot from that of writing cultures: "When auditory experiences are shared, histories too are shared, and not only from mouth to ear: they are perceived by and encoded in the body through the physicality of sound waves and passed on from one generation to another." (Public Radio - documenta 14, no date).
Technology becomes an extension of this desire to reach the invisible and distant, something beyond the limitations of your own body. But when I talk about detachment I don’t mean it in a negative sense. There is first alienation, frustration and distancing, but if we understand this communication with our body, we understand it better. It is a way of understanding media through simple techniques such as just ‘repeating’ a YouTube video or transcribing the voice of our interlocutor.

Who can help me and how?

People I interview, like Reni Hofmüller, can help me experiment with the specific technologies of mediation and imagine other aspects of them. Also, Raadio Caargo can help me imagine potential futures [feminist futurotopias, as they call them] of the mediums by engaging with different related methodologies and practices. Joana, a former student, can help me with prototyping, references and discussion on the embodied and the distant voice. My tutors Amy and Clara can help me with deep listening exercises.

Previously

I have been moving in this direction over the last years. I have worked with collective writing, speech recognition, collective annotation and collective reading. In my work I am often interested in parallel presence through the voice and in the tools that relate the embodied and the distant voice.

Relation to a larger context

In the past, telephone operators, mostly women, were underpaid and worked in a stressful position. They were invisible, and their voices represented a specific feminine character that their companies promoted to serve their clients. I think one contemporary version of this labour is the contribution of people to databases for training a machine. Though the conditions are different and better, the ethics around it are still in question. "(S)hould we be worried about the large-scale harvesting of our voiceprints? (…) The companies behind this technology say that a voiceprint includes more than 100 unique physical and behavioural characteristics of each individual" (Jones, 2018). On the other side of these machines is the effect they have on decisions regarding people’s lives and their access to spaces, for example the verification of refugees’ claims of origin that I referred to previously. Also, the bias spread by the old telephone companies still exists. The feminine voice representing a polite servant has been used many times by artificially intelligent machines. Examples are the GPS navigator, Amazon’s personal assistant Alexa and Apple’s Siri.
There are several attempts by feminists, artists and programmers to approach hacking and technological cultures from a more feminist perspective that involves the body and the vulnerabilities of the individual. One example is the collective Hacking With Care, which includes practices of taking care of the body while working as hackers. Another is the work of r∆∆dio c∆∆rgo and Spideralex, who reclaim their relationships with technologies by engaging with different collective methodologies and practices. From my point of view, these attempts aim to subvert the dominant uses of and relations with technology and to propose other ways of communicating.

Bibliography

The Economist (2018) ‘Are liberals and populists just searching for a new master?’, 8 October. Available at: https://www.economist.com/open-future/2018/10/08/are-liberals-and-populists-just-searching-for-a-new-master (Accessed: 25 October 2018).

Sputnik (no date) Germany to Use Dialect Recognition Software to Verify Origins of Refugees. Available at: https://sputniknews.com/europe/201703181051711403-germany-software-dialects-refugees/ (Accessed: 11 November 2018).

Jones, R. (2018) ‘Voice recognition: is it really as secure as it sounds?’, The Guardian, 22 September. Available at: https://www.theguardian.com/money/2018/sep/22/voice-recognition-is-it-really-as-secure-as-it-sounds (Accessed: 11 November 2018).

Public Radio - documenta 14 (no date). Available at: https://www.documenta14.de/en/public-radio/ (Accessed: 7 November 2018).

Notes

[1] https://catalog.ldc.upenn.edu/search