User:Angeliki/Exhibition ttssr

Link of the project: https://pzwiki.wdka.nl/mediadesign/Ttssr-Speech_Recognition_Iterations

                                               ttssr-loop-human-only

My collection of texts From Tedious Tasks to Liberating Orality: Practices of the Excluded on Sharing Knowledge, refers to orality in relation to programming, as a way of sharing knowledge including our individually embodied position and voice. The emphasis on the role of personal positioning is often supported by feminist theorists. Similarly, and in contrast to scanning, reading out loud is a way of distributing knowledge in a shared space with other people, and this is the core principle behind the ttssr-> Reading and speech recognition in loop software. Using speech recognition software and python scripts I propose to the audience to participate in a system that highlights how each voice bears the personal story of an individual. In this case the involvement of a machine provides another layer of reflection of the reading process.

The story behind

As in oral cultures, in contrast to the literate cultures, I am getting engaged with two important ingredients for knowledge production; the presence of a speaker and the oral speech. The oral narratives are based on the previous ones keeping a movable line to the past by adjusting to the history of the performer, but only if they are important and enjoyable for the present audience and time. The structure of the oral poem is based on rhythmic formulas, the memory and verbal interaction. Oral cultures exist without the need of writing, texts and dictionaries. They don't need a library to be stored, to look up and create their texts. The learning process is shared from individual positions but with the participation of the community. Both my reader and my software highlight also another aspect of knowledge production, from literate cultures, regarding repetition and text processing tasks. From weaving to typewriting and programming, women, mainly hidden from the public, were exploring the realm of writing beyond its conventional form. According to Kittler (1999, pg. 221) “A desexualized writing profession, distant from any authorship, only empowers the domain of text processing. That is why so many novels written by recent women writers are endless feedback loops making secretaries into writers”. But aren’t these endless feedback loops similar to the rhythmic narratives of the anonymous oral cultures? How this knowledge is produced through repetitive formulas that are easily memorized? In the context of the present available technologies, like speech recognition software, and in relation with other projects, using the technology of their time, these questions can be explored on a more practical base. With the aid and the errors of Pocketsphinx, I aim to create new oral and reading experiences. My work derives from the examples of two other projects coming from 80s: Boomerang (1974) and I Am Sitting In A Room (1981). The first one is about forming a tape by recording and broadcasting continuously the artist speaking. The latter is exploiting the imperfections of the system of recording tape machines and grabs the room echoes as musical qualities in the tape-delay system. It is noticeable that repetition seems to become an interesting instrument in the two projects and the process of typewriting. In all cases the machine includes repetition as its basic element (eternal loops and repeated processes). My software transcribes the voice of people reading a text. This process can be related to the typists transcribing the speech of writers, replacing the contemporary speech recognition software. Focusing on this anonymous labour, worked together with a machine, one cannot forget the anonymous people, who worked for the Pocketsphinx models. The training of this software is based on recorded human voices reading words. The acoustic model, being created after, is structured with phonemes. I am reversing this procedure by giving back Pocketsphinx to human voices.

Input text

Any one is one having been that one Any one is such a one.

Any one having been that one is one remembering something oi such a thing, is one
remembering having been that one.

Each one having been one is being one having been that one. Each one haying been

one is remembering something of this thing, is remembering something or haying been
that one

Each one is one. Each one has been one. Each one being one, each one havrng been

one is remembering something or that thing.

Necessary Equipment

for one oral poet:

1 laptop
2 sets of headphones with microphone
1 USB audio interface

for >1 oral poets:

1 laptop
1 loudspeaker
1 microphone
1 set of headphones
1 USB audio interface

Set-up

Print this page and put it in the installation

Software installation

In Linux/Mac

README.md

This project is part of the OuNuPo project of the Experimental Publishing course in Master of Media Design in Piet Zwart Institute. https://issue.xpub.nl/05/ https://git.xpub.nl/OuNuPo-make/ https://xpub.nl/

This is a different and separated version of my own proposal within the project.

Author

Angeliki Diakrousi

Description

Reading and speech recognition feedback loops using the first sentence of a text

output.txt

According to Kittler (1999, pg. 221) “A desexualized writing profession, distant from any authorship, only empowers the domain of text processing. That is why so many novels written by recent women writers are endless feedback loops making secretaries into writers”. My choice for input to this software is an extract of the text 'Many Many Women' of Gertrude Stein. The input can be any text of your choice.

References

Kittler, F.A., (1999) Typewriter, in: Winthrop-Young, G., Wutz, M. (Trans.), Gramophone, Film, Typewriter. Stanford University Press, Stanford, Calif, pp. 214–221 Stein, G., 2005. Many Many Women, in: Matisse Picasso and Gertrude Stein With Two Shorter Stories

Install Dependencies

PocketSphinx package sudo aptitude install pocketsphinx pocketsphinx-en-us

PocketSphinx Python library: sudo pip3 install PocketSphinx

Try this: pocketsphinx_continuous

If you find this error error while loading shared libraries: libpocketsphinx.so.3: cannot open shared object file: No such file or directory do this sudo nano /etc/ld.so.conf and add:

include usr/local/lib

include usr/

Other software packages:sudo apt-get install gcc automake autoconf libtool bison swig python-dev libpulse-dev

Speech Recognition Python library: sudo pip3 install SpeechRecognition

TermColor Python library: sudo pip3 install termcolor

PyAudio Python library: sudo pip3 install pyaudio

Clone Repository

git clone https://gitlab.com/nglk/ttssr.git

Make command

Sitting inside a pocket(sphinx) Speech recognition feedback loops using the first sentence of a scanned text as input

run: cd ttssr make ttssr-human-only

Licenses:

2018 WTFPL – Do What the Fuck You Want to Public License. 2018 BSD 3-Clause – Berkeley Software Distribution

Scripts

Makefile

ttssr-human-only: ocr/output.txt ## Loop: text to speech-speech recognition. Dependencies: espeak, pocketsphinx
	bash src/ttssr-loop-human-only.sh ocr/output.txt

ttssr-loop-human-only.sh

#!/bin/bash
i=0;
echo "Read every new sentence out loud!"
head -n 1 $1 > output/input0.txt
while [[ $i -le 10 ]]
	do echo $i
	cat output/input$i.txt 
	python3 src/ttssr_write_audio.py src/sound$i.wav 2> /dev/null
	play src/sound$i.wav repeat 5 2> /dev/null & #in the background the sound, without it all the sounds play one by one//2 is stderr
	python3 src/ttssr_transcribe.py sound$i.wav > output/input$((i+1)).txt 2> /dev/null
	sleep 1
	(( i++ ))
done
today=$(date +%Y%m%d.%H-%M);
mkdir -p "output/ttssr.$today"
mv -v output/input* output/ttssr.$today;
mv -v src/sound* output/ttssr.$today;



 #            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
 #                    Version 2, December 2004

 # Copyright (C) 2018 Angeliki Diakrousi <diakaggel@gmail.com>

 # Everyone is permitted to copy and distribute verbatim or modified
 # copies of this license document, and changing it is allowed as long
 # as the name is changed.

 #            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
 #   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

 #  0. You just DO WHAT THE FUCK YOU WANT TO.

ttssr_transcribe.py

#!/usr/bin/env python3

import speech_recognition as sr
import sys
from termcolor import cprint, colored
from os import path
import random

a1 = sys.argv[1] 
AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), a1) 


# use the audio file as the audio source
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
    audio = r.record(source) 

color = ["white", "yellow"]
on_color = ["on_red", "on_magenta", "on_blue", "on_grey"]

# recognize speech using Sphinx
try:
    cprint( r.recognize_sphinx(audio), random.choice(color), random.choice(on_color))
except sr.UnknownValueError:
    print("uknown")
except sr.RequestError as e:
    print("Sphinx error; {0}".format(e))


# Copyright (c) 2018, Angeliki Diakrousi <diakaggel@gmail.com>
# Copyright (c) 2014-2017, Anthony Zhang <azhang9@gmail.com>
# All rights reserved.

# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:

# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.

# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.

# 3. Neither the name of the copyright holder nor the names of its contributors
# may be used to endorse or promote products derived from this software without
# specific prior written permission.

# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

src/ttssr_write_audio.py

#!/usr/bin/env python3
# NOTE: this example requires PyAudio because it uses the Microphone class

import speech_recognition as sr
import sys
from time import sleep

a1 = sys.argv[1]

# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    # print("Read every new sentence out loud!")
    audio = r.listen(source)

# write audio to a WAV file
with open(a1, "wb") as f:
	f.write(audio.get_wav_data())



# Copyright (c) 2018, Angeliki Diakrousi <diakaggel@gmail.com>
# Copyright (c) 2014-2017, Anthony Zhang <azhang9@gmail.com>
# All rights reserved.

# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:

# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.

# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.

# 3. Neither the name of the copyright holder nor the names of its contributors
# may be used to endorse or promote products derived from this software without
# specific prior written permission.

# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.