User:Laurier Rochon/prototyping/pythov: Difference between revisions
Revision as of 19:13, 16 May 2011
Building Markov chains with simple sentences.
Each run reproduces the same number of sentences as the source text, and every generated sentence starts with the first word of the corresponding source sentence.
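A minimal sketch of that constraint, using the sample text from this page: count the sentences and collect the first word of each one. The helper name `first_words` is made up for illustration and is not part of the scripts on this page; it also assumes sentences are separated by periods only.

```python
def first_words(text):
    """Return the first word of every period-delimited sentence in text."""
    # Split on periods and drop the empty piece left after the final period.
    sentences = [s.strip() for s in text.split('.') if s.strip()]
    return [s.split()[0] for s in sentences]

text = 'He saw the cat before he saw the potato.He knew it.I saw him.'
starts = first_words(text)
print(starts)       # ['He', 'He', 'I']
print(len(starts))  # 3, so every generated line holds 3 sentences
```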
Approach 1: strings and substrings
Conclusion: denser and harder to read, but it works well. Handling multiple sentences gets sloppy, because counters are needed to remember where the last period was.
He knew it. He knew it. I saw the cat before he saw the cat before he saw him.
He knew it. He knew it. I saw him.
He saw the cat before he knew it. He saw the cat before he knew it. I saw the potato.
He saw him. He saw him. I saw the potato.
He saw the potato. He saw the cat before he saw him. I saw the cat before he saw the potato.
He knew it. He saw him. I saw the potato.
He knew it. He knew it. I saw the cat before he saw him.
He saw the potato. He knew it. I saw him.
He knew it. He knew it. I saw the cat before he saw the cat before he saw the potato.
He knew it. He knew it. I saw him.
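import random, re

for run in range(0, 10):
    text = 'He saw the cat before he saw the potato.He knew it.I saw him.'
    # Pad periods with spaces so that '.' behaves like a word of its own.
    text = text.replace('.', ' . ')
    f = ''
    nbsents = text.count('.')
    nbchars = 0  # position where the current source sentence starts
    for s in range(0, nbsents):
        # First word of the current source sentence.
        start = nbchars
        end = text.find(" ", nbchars+1)
        # Remember where the next sentence starts (just past the period).
        dot = text.find(" . ", nbchars+1)
        nbchars = dot+2
        chosen = text[start:end].strip(' \t\n\r')
        f = f+text[start:end]+" "
        while chosen != '.':
            # Collect every word that follows an occurrence of the chosen word.
            searchstr = "\\b%s\\b" % re.escape(chosen)
            pattern = re.compile(searchstr, re.IGNORECASE)
            nextwords = []
            for m in pattern.finditer(text):
                nextwordpos = text.find(" ", m.end()+1)
                nextwords.append(text[m.end()+1:nextwordpos])
            # Pick one follower at random and append it to the output.
            chosen = nextwords[random.randrange(0, len(nextwords))]
            f = f+chosen
            if chosen != '.':
                f = f+' '
    f = f.replace(' .', '.')
    print(f)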
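The core of the string-based approach is the lookup of every word that follows a given word. The sketch below isolates that step in a hypothetical helper named `successors` (not part of the original script). It assumes the text ends with a period, so every match is followed by a space-delimited token.

```python
import re

def successors(text, word):
    """List every token that directly follows word in text (case-insensitive)."""
    # Pad periods with spaces so '.' counts as a token of its own.
    padded = text.replace('.', ' . ')
    pattern = re.compile(r'\b%s\b' % re.escape(word), re.IGNORECASE)
    found = []
    for m in pattern.finditer(padded):
        # The next token starts one character past the match (after the space).
        nxt = padded.find(' ', m.end() + 1)
        found.append(padded[m.end() + 1:nxt])
    return found

text = 'He saw the cat before he saw the potato.He knew it.I saw him.'
print(successors(text, 'he'))   # ['saw', 'saw', 'knew']
print(successors(text, 'saw'))  # ['the', 'the', 'him']
```

Because the whole text is rescanned for every generated word, this lookup is repeated many times per sentence, which is part of what makes the string-based version dense.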
Approach 2: dictionaries and lists
Conclusion: more lightweight, a bit more modular, and much easier to extend to multiple sentences.
He knew it. He saw him. I saw the cat before he knew it.
He saw him. He saw him. I saw him.
He saw him. He saw the potato. I saw him.
He saw the cat before he knew it. He knew it. I saw him.
He saw him. He knew it. I saw him.
He knew it. He knew it. I saw the cat before he saw the potato.
He saw the cat before he saw him. He knew it. I saw him.
He knew it. He knew it. I saw the potato.
He saw the potato. He saw the potato. I saw him.
He knew it. He saw the potato. I saw him.
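import random

for run in range(0, 10):
    text = 'He saw the cat before he saw the potato.He knew it.I saw him.'
    # Pad periods with spaces so '.' becomes its own token, and fold case.
    text = text.replace('.', ' . ').lower()
    words = text.split()
    f = ''
    # Map each word to the list of words that follow it in the text.
    d = {}
    for c in range(0, len(words)-1):
        if words[c] not in d:
            d[words[c]] = []
        d[words[c]].append(words[c+1])
    sents = text.split('.')
    # The piece after the final period is empty, hence len(sents)-1.
    for a in range(0, len(sents)-1):
        # Start each generated sentence with the first word of the source sentence.
        allw = sents[a].strip(' \t\n\r').split()
        chosen = allw[0]
        f = f + str(chosen).capitalize()+' '
        # Follow random transitions until a period closes the sentence.
        while chosen != '.':
            new = random.choice(d[chosen])
            f = f + str(new)+' '
            chosen = new
    f = f.replace(' .', '.')
    print(f)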
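To illustrate the "more modular" point, the dictionary approach can be split into two small functions. This is only a sketch: the names `build_table` and `generate` and the optional `rng` parameter are assumptions introduced here, and the text is assumed to end with a period.

```python
import random

def build_table(text):
    """Map each lowercase token to the list of tokens that follow it."""
    words = text.replace('.', ' . ').lower().split()
    table = {}
    for current, following in zip(words, words[1:]):
        table.setdefault(current, []).append(following)
    return table

def generate(text, rng=random):
    """Build one line with as many sentences as text, keeping each first word."""
    table = build_table(text)
    out = []
    for sentence in text.split('.'):
        if not sentence.strip():
            continue  # skip the empty piece after the final period
        word = sentence.split()[0].lower()
        parts = [word.capitalize()]
        # Follow random transitions until a period closes the sentence.
        while word != '.':
            word = rng.choice(table[word])
            parts.append(word)
        # parts ends with '.', so join the words and attach the period directly.
        out.append(' '.join(parts[:-1]) + '.')
    return ' '.join(out)

text = 'He saw the cat before he saw the potato.He knew it.I saw him.'
print(build_table(text)['he'])  # ['saw', 'saw', 'knew']
print(generate(text, random.Random(0)))
```

Passing a seeded `random.Random` instance makes a run repeatable, which helps when comparing the two approaches.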