User:Angeliki/2nd Trimester: Difference between revisions

From XPUB & Lens-Based wiki
 
(147 intermediate revisions by the same user not shown)
Line 1: Line 1:
<div style='  
<div style='  
width: 80%;
width: 80%;
Line 9: Line 8:
   background-color: #FCFC03;
   background-color: #FCFC03;
'>
'>


== OCR ==
== OCR ==
=== Tesseract training: ===
=== Tesseract training ===
[[Install Tesseract 4.0-Ubuntu|1. Install Tesseract]]<br />
[[Install Tesseract 4.0-Ubuntu|1. Install Tesseract]]<br />
2. Recipe for training
2. Recipe for training


== Reading- Writing ==
== Reading- Writing ==
[[Synopsis_24-1-2018#Reading-_Angeliki|Synopsis]]
[[Synopsis_24-1-2018#Reading-_Angeliki|Synopsis]]<br />
[[Essay21Feb|Comparative essay]]


== Reader ==
== Reader ==
[https://pzwiki.wdka.nl/mw-mediadesign/images/7/71/XPUB_reader_concept.pdf Mini reader]
[https://pzwiki.wdka.nl/mw-mediadesign/images/7/71/XPUB_reader_concept.pdf Mini reader]


== Python scripts ==
[[Reader#6/Angeliki|Reader#6<br />
Python whisperer
 
<syntaxhighlight lang="python" line='line'>
import random

import nltk

# Tokenize the synopsis and lowercase every token.
with open("Synopsis_24012018.txt", "r") as o:
    original = o.read()
tokens = [token.lower() for token in nltk.word_tokenize(original)]
# print(tokens)

# Load the noun list; a set keeps the membership tests below fast.
with open("nouns/91K nouns.txt") as v:
    nouns = v.read()
tokens_nouns = set(nltk.word_tokenize(nouns))
# print(tokens_nouns)

# Collect the nouns that occur in the synopsis.
newnouns = []
for word in tokens:
    if word in tokens_nouns:
        newnouns.append(word)
# print(newnouns)

 
filename = 'Audiosfera-2015-Westerkamp.txt'
vocabulary = []

vocabulary_size = 1000  # not used below


def read_input_text(filename):
    """Append every lowercased token of the file to `vocabulary`."""
    with open(filename, 'r') as txtfile:
        string = txtfile.read()
    words = nltk.word_tokenize(string)
    # print(words)
    for word in words:
        vocabulary.append(word.lower())
    # print('Data size:', len(vocabulary))


read_input_text(filename)
# print(vocabulary)


newsynopsis = []
<big>''~~ From Tedious Tasks to Liberating Orality ~~''</big>]]
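# Swap every noun of the second text for a random noun from the synopsis.
for word in vocabulary:
    # Skip the swap if the synopsis yielded no nouns to choose from.
    if word in tokens_nouns and newnouns:
        newsynopsis.append(random.choice(newnouns))
    else:
        newsynopsis.append(word)
print(" ".join(newsynopsis))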


Cover made with Graphviz


== [[Angeliki/PROTOTYPING 2|Python scripts]] ==


</syntaxhighlight>
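The noun swap in the script above can also be tried without NLTK or the two text files. What follows is only a minimal sketch under assumptions: <code>tokenize</code> and <code>swap_nouns</code> are hypothetical helper names, the regular-expression tokenizer is a rough ASCII-only stand-in for <code>nltk.word_tokenize</code>, and the sample sentences and noun list are invented for illustration.

```python
import random
import re


def tokenize(text):
    """Lowercase the text and split it into ASCII words and single other characters."""
    return re.findall(r"[a-z']+|[^\sa-z']", text.lower())


def swap_nouns(source_text, target_text, noun_list, rng=random):
    """Replace each listed noun in target_text with a random noun found in source_text."""
    nouns = set(noun_list)
    # Nouns that actually occur in the source text form the replacement pool.
    pool = [token for token in tokenize(source_text) if token in nouns]
    result = []
    for token in tokenize(target_text):
        if token in nouns and pool:
            result.append(rng.choice(pool))
        else:
            # Non-nouns pass through; so do nouns when the pool is empty.
            result.append(token)
    return " ".join(result)


if __name__ == "__main__":
    # Hypothetical inputs standing in for the synopsis, the second text and the noun list.
    synopsis = "The scanner reads the book aloud."
    essay = "A voice fills the room with sound."
    noun_list = ["scanner", "book", "voice", "room", "sound"]
    print(swap_nouns(synopsis, essay, noun_list, random.Random(0)))
```

Passing a seeded <code>random.Random</code> makes a run repeatable; leaving <code>rng</code> at its default gives a different text on every run, as in the script above.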
== The secrets of pocketsphinx ==
=== Acoustic model/training ===
what

Latest revision as of 18:52, 24 March 2018


OCR

Tesseract training

1. Install Tesseract
2. Recipe for training

Reading- Writing

Synopsis
Comparative essay

Reader

Mini reader

Reader#6
~~ From Tedious Tasks to Liberating Orality ~~

Cover made with Graphviz

Python scripts

The secrets of pocketsphinx

Acoustic model/training

what