User:Manetta/scripts/pocketsphinx-to-text

From XPUB & Lens-Based wiki

input example: about HAL

pocketsphinx output

HAL (2001: A Space Odyssey, 1968):

000000000: and i ask
000000001: don't mind told
000000002: i hope the to view and concerned about drugs
000000003: quite sure so i know everything we house is the site right 
let's say us at a shore you know it's very confident face that it's going
to see a hole right to get us i feel much for to the status
to where you got very loose from various to stay

Samantha (Her, 2013):

000000000: we don't think when you me because you get into work but 
you when i'm of what is not leave the coming when the suffering
so you minus of character on most who she i will
to another u. two is that a risky had her grief for that a you
000000001: tony everything that a to do to have he thinks he has ago
if your home and his family and to come in to you and
can do you say will don't worry about his talk you an chairman who
will as she i and i think to are good for its new home you're
not an on it moving to a there
000000002: this you do if you he
000000003: through the sea

Echo (Amazon Echo, 2015):

000000000: when it we're not time to for what right

Gerty (Moon, 2009):

000000000: fed is the chief
000000001: i think the that a u. upset by industry or not for that she was 
i out the side the law firm is a to also allows them to are used to and
you are send the sand what is it might help to talk about
a hit says that going hungry are you hungry and you have
changed and so the senate and i don't it so
if you knew is going to be

Ash (Black Mirror, 2013):

000000000: or so on alexander
000000001: hello to call wall street the at it a cry at the break the speech today 
he night on know hear stories i mean i'm not three
000000002: sorry i only to see if you ask that no matter is all right but it lot
have been on the that data on room always been a new trial to hit
half ago in this in somebody who like a u. s. a. of the out that you all right
000000003: this is years
000000004: the shore of it in our out to a tax cut
000000005: and you won't e. c. go where we going very high and married has brought to see though law and if you the the way us


code: video to wav to text

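# Python 3
import glob
import subprocess

# Convert every .mp4 in the current directory to a WAV file,
# then transcribe that WAV file with pocketsphinx_continuous.
for video in glob.glob("*.mp4"):
    name = video.replace(".mp4", "")
    outputname = name.replace("answ-quest-", "")
    wav = name + ".wav"
    print("****WORKING ON******", name)

    # -vn drops the video stream; the audio becomes 16-bit PCM, mono, 16 kHz
    cmd_ffmpeg = ["ffmpeg", "-i", video, "-vn", "-acodec", "pcm_s16le",
                  "-ac", "1", "-ar", "16000", "-y", wav]
    print(" ".join(cmd_ffmpeg))
    subprocess.call(cmd_ffmpeg)

    with open("pocketsphinx-" + outputname + ".txt", "w") as txt:
        # recognized lines carry an utterance id such as 000000000:
        marker = "00000"
        cmd = ["pocketsphinx_continuous", "-infile", wav]
        print(" ".join(cmd))

        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                universal_newlines=True)
        for line in proc.stdout:
            # keep only lines containing the marker (replaces the former `| grep`)
            if marker in line:
                print(line, end="")
                txt.write(line)
        proc.wait()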
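
The text files written by the script keep the utterance ids (such as 000000000:) in front of each recognized phrase, and one phrase can run over several lines. A minimal sketch of how such a file could be split into (id, text) pairs follows. It assumes an id is always nine digits followed by a colon, which is read off the output above and not taken from the pocketsphinx documentation. The function name split_utterances is made up for this example, and the sample is a shortened excerpt of the Ash output.

```python
import re

# Assumption: an utterance id is exactly nine digits followed by a colon.
UTTERANCE_ID = re.compile(r"(\d{9}):")


def split_utterances(text):
    """Return a list of (utterance_id, hypothesis) tuples found in text."""
    # With one capture group, re.split returns
    # [text before first id, id1, text1, id2, text2, ...]
    parts = UTTERANCE_ID.split(text)
    utterances = []
    for utt_id, hypothesis in zip(parts[1::2], parts[2::2]):
        # Collapse line breaks and repeated spaces inside one hypothesis.
        utterances.append((utt_id, " ".join(hypothesis.split())))
    return utterances


# Shortened excerpt of the Ash output, including an id that sits mid-line.
sample = (
    "000000000: or so on alexander\n"
    "000000001: hello to call wall street the at it a cry at the break the speech today \n"
    "he night on know hear stories i mean i'm not three 000000002: sorry i only to see"
)

for utt_id, hypothesis in split_utterances(sample):
    print(utt_id, "|", hypothesis)
```

Because the split is done on the id pattern and not on line breaks, ids that ended up in the middle of a line are handled the same way as ids at the start of a line.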


to do future steps

Look more closely at pocketsphinx and its process of word recognition: how can the results be optimized?
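
One possible way to make "optimizing results" measurable is to compare a recognized phrase with a hand-made transcript of the same fragment and count the word errors. The sketch below computes a word error rate (word-level edit distance divided by the number of reference words). This is a common measure for speech recognition, but it is not something the script on this page does. The function name word_error_rate and the example phrases are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two strings, divided by the
    number of words in the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Toy phrases, not real transcripts of the films.
print(word_error_rate("a b c d", "a x c"))
print(word_error_rate("and i ask", "and i ask"))
```

A lower value means the recognized text is closer to the transcript, so the same fragment could be run with different settings and the values compared.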

Display the process of recognition by generating an image of all the words that were recognized but then rejected as the right one. This feels a bit like a living organism that is thinking. A script doing this was shown by RYBN at Cqrrelations (January 2015).

As an example:

Mb-voice-I-STTtypo-page001.png

