User:Angeliki/Exhibition ttssr
Link of the project: https://pzwiki.wdka.nl/mediadesign/Ttssr-Speech_Recognition_Iterations
ttssr-loop-human-only
My collection of texts From Tedious Tasks to Liberating Orality: Practices of the Excluded on Sharing Knowledge, refers to orality in relation to programming, as a way of sharing knowledge including our individually embodied position and voice. The emphasis on the role of personal positioning is often supported by feminist theorists. Similarly, and in contrast to scanning, reading out loud is a way of distributing knowledge in a shared space with other people, and this is the core principle behind the ttssr-> Reading and speech recognition in loop software. Using speech recognition software and python scripts I propose to the audience to participate in a system that highlights how each voice bears the personal story of an individual. In this case the involvement of a machine provides another layer of reflection of the reading process.
The story behind
As in oral cultures, in contrast to the literate cultures, I am getting engaged with two important ingredients for knowledge production; the presence of a speaker and the oral speech. The oral narratives are based on the previous ones keeping a movable line to the past by adjusting to the history of the performer, but only if they are important and enjoyable for the present audience and time. The structure of the oral poem is based on rhythmic formulas, the memory and verbal interaction. Oral cultures exist without the need of writing, texts and dictionaries. They don't need a library to be stored, to look up and create their texts. The learning process is shared from individual positions but with the participation of the community. Both my reader and my software highlight also another aspect of knowledge production, from literate cultures, regarding repetition and text processing tasks. From weaving to typewriting and programming, women, mainly hidden from the public, were exploring the realm of writing beyond its conventional form. According to Kittler (1999, pg. 221) “A desexualized writing profession, distant from any authorship, only empowers the domain of text processing. That is why so many novels written by recent women writers are endless feedback loops making secretaries into writers”. But aren’t these endless feedback loops similar to the rhythmic narratives of the anonymous oral cultures? How this knowledge is produced through repetitive formulas that are easily memorized? In the context of the present available technologies, like speech recognition software, and in relation with other projects, using the technology of their time, these questions can be explored on a more practical base. With the aid and the errors of Pocketsphinx, I aim to create new oral and reading experiences. My work derives from the examples of two other projects coming from 80s: Boomerang (1974) and I Am Sitting In A Room (1981). The first one is about forming a tape by recording and broadcasting continuously the artist speaking. The latter is exploiting the imperfections of the system of recording tape machines and grabs the room echoes as musical qualities in the tape-delay system. It is noticeable that repetition seems to become an interesting instrument in the two projects and the process of typewriting. In all cases the machine includes repetition as its basic element (eternal loops and repeated processes). My software transcribes the voice of people reading a text. This process can be related to the typists transcribing the speech of writers, replacing the contemporary speech recognition software. Focusing on this anonymous labour, worked together with a machine, one cannot forget the anonymous people, who worked for the Pocketsphinx models. The training of this software is based on recorded human voices reading words. The acoustic model, being created after, is structured with phonemes. I am reversing this procedure by giving back Pocketsphinx to human voices.
Input text
Any one is one having been that one Any one is such a one.
Any one having been that one is one remembering something oi such a thing, is one
remembering having been that one.
Each one having been one is being one having been that one. Each one haying been
one is remembering something of this thing, is remembering something or haying been
that one
Each one is one. Each one has been one. Each one being one, each one havrng been
one is remembering something or that thing.
Necessary Equipment
for one oral poet:
- 1 laptop
- 2 sets of headphones with microphone
- 1 USB audio interface
for >1 oral poets:
- 1 laptop
- 1 loudspeaker
- 1 microphone
- 1 set of headphones
- 1 USB audio interface
Set-up
Print this page and put it in the installation
Software installation
In Linux/Mac
README.md
This project is part of the OuNuPo project of the Experimental Publishing course in Master of Media Design in Piet Zwart Institute. https://issue.xpub.nl/05/ https://git.xpub.nl/OuNuPo-make/ https://xpub.nl/
This is a different and separated version of my own proposal within the project.
Author
Angeliki Diakrousi
Description
Reading and speech recognition feedback loops using the first sentence of a text
output.txt
According to Kittler (1999, pg. 221) “A desexualized writing profession, distant from any authorship, only empowers the domain of text processing. That is why so many novels written by recent women writers are endless feedback loops making secretaries into writers”. My choice for input to this software is an extract of the text 'Many Many Women' of Gertrude Stein. The input can be any text of your choice.
References
Kittler, F.A., (1999) Typewriter, in: Winthrop-Young, G., Wutz, M. (Trans.), Gramophone, Film, Typewriter. Stanford University Press, Stanford, Calif, pp. 214–221 Stein, G., 2005. Many Many Women, in: Matisse Picasso and Gertrude Stein With Two Shorter Stories
Install Dependencies
PocketSphinx package sudo aptitude install pocketsphinx pocketsphinx-en-us
PocketSphinx Python library: sudo pip3 install PocketSphinx
- Try this: pocketsphinx_continuous
- If you find this error error while loading shared libraries: libpocketsphinx.so.3: cannot open shared object file: No such file or directory do this sudo nano /etc/ld.so.conf and add:
- include usr/local/lib
- include usr/
Other software packages:sudo apt-get install gcc automake autoconf libtool bison swig python-dev libpulse-dev
Speech Recognition Python library: sudo pip3 install SpeechRecognition
TermColor Python library: sudo pip3 install termcolor
PyAudio Python library: sudo pip3 install pyaudio
Clone Repository
git clone https://gitlab.com/nglk/ttssr.git
Make command
Sitting inside a pocket(sphinx) Speech recognition feedback loops using the first sentence of a scanned text as input
run: cd ttssr make ttssr-human-only
Licenses:
2018 WTFPL – Do What the Fuck You Want to Public License. 2018 BSD 3-Clause – Berkeley Software Distribution
Scripts
Makefile
ttssr-human-only: ocr/output.txt ## Loop: text to speech-speech recognition. Dependencies: espeak, pocketsphinx
bash src/ttssr-loop-human-only.sh ocr/output.txt
ttssr-loop-human-only.sh
#!/bin/bash
i=0;
echo "Read every new sentence out loud!"
head -n 1 $1 > output/input0.txt
while [[ $i -le 10 ]]
do echo $i
cat output/input$i.txt
python3 src/ttssr_write_audio.py src/sound$i.wav 2> /dev/null
play src/sound$i.wav repeat 5 2> /dev/null & #in the background the sound, without it all the sounds play one by one//2 is stderr
python3 src/ttssr_transcribe.py sound$i.wav > output/input$((i+1)).txt 2> /dev/null
sleep 1
(( i++ ))
done
today=$(date +%Y%m%d.%H-%M);
mkdir -p "output/ttssr.$today"
mv -v output/input* output/ttssr.$today;
mv -v src/sound* output/ttssr.$today;
# DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
# Version 2, December 2004
# Copyright (C) 2018 Angeliki Diakrousi <diakaggel@gmail.com>
# Everyone is permitted to copy and distribute verbatim or modified
# copies of this license document, and changing it is allowed as long
# as the name is changed.
# DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
# TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
# 0. You just DO WHAT THE FUCK YOU WANT TO.
ttssr_transcribe.py
#!/usr/bin/env python3
import speech_recognition as sr
import sys
from termcolor import cprint, colored
from os import path
import random
a1 = sys.argv[1]
AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), a1)
# use the audio file as the audio source
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
audio = r.record(source)
color = ["white", "yellow"]
on_color = ["on_red", "on_magenta", "on_blue", "on_grey"]
# recognize speech using Sphinx
try:
cprint( r.recognize_sphinx(audio), random.choice(color), random.choice(on_color))
except sr.UnknownValueError:
print("uknown")
except sr.RequestError as e:
print("Sphinx error; {0}".format(e))
# Copyright (c) 2018, Angeliki Diakrousi <diakaggel@gmail.com>
# Copyright (c) 2014-2017, Anthony Zhang <azhang9@gmail.com>
# All rights reserved.
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# 3. Neither the name of the copyright holder nor the names of its contributors
# may be used to endorse or promote products derived from this software without
# specific prior written permission.
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
src/ttssr_write_audio.py
#!/usr/bin/env python3
# NOTE: this example requires PyAudio because it uses the Microphone class
import speech_recognition as sr
import sys
from time import sleep
a1 = sys.argv[1]
# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
# print("Read every new sentence out loud!")
audio = r.listen(source)
# write audio to a WAV file
with open(a1, "wb") as f:
f.write(audio.get_wav_data())
# Copyright (c) 2018, Angeliki Diakrousi <diakaggel@gmail.com>
# Copyright (c) 2014-2017, Anthony Zhang <azhang9@gmail.com>
# All rights reserved.
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
# 1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
# 3. Neither the name of the copyright holder nor the names of its contributors
# may be used to endorse or promote products derived from this software without
# specific prior written permission.
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.