User:Laurier Rochon/prototyping/pythov
Revision as of 19:10, 16 May 2011 by Laurier Rochon (talk | contribs)
Building Markov chains from simple sentences. The output reproduces the same number of sentences as the input, and each generated sentence always starts with the first word of the corresponding input sentence.
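To make the idea concrete, here is a minimal sketch of the table such a chain relies on: for each word, the words observed right after it and how often. The helper name `follower_counts` is hypothetical and not part of the scripts on this page; it assumes periods are treated as tokens of their own and that case is ignored.

```python
from collections import Counter, defaultdict

def follower_counts(text):
    """Map each word to a Counter of the words seen right after it."""
    # Assumption: pad periods so they become separate tokens, and ignore case.
    tokens = text.replace('.', ' . ').lower().split()
    table = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        table[current][following] += 1
    return table

table = follower_counts('He saw the cat before he saw the potato.He knew it.I was sure.')
for word in ('he', 'the'):
    total = sum(table[word].values())
    for following, count in table[word].items():
        print('%s -> %s: %d/%d' % (word, following, count, total))
```

With this input, "he" is followed by "saw" twice and by "knew" once, so a uniform pick among the observed followers chooses "saw" two times out of three.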
Approach 1: using strings/substrings
Conclusion: denser and harder to read, but it works well. The multi-sentence handling gets sloppy, because the start of each sentence has to be tracked by character offset.
He saw the cat before he knew it. He saw the cat before he saw the potato. I was sure.
He saw the cat before he saw the potato. He saw the cat before he knew it. I was sure.
He saw the cat before he saw the potato. He saw the cat before he knew it. I was sure.
He saw the cat before he knew it. He knew it. I was sure.
He knew it. He saw the cat before he knew it. I was sure.
He saw the potato. He knew it. I was sure.
He knew it. He saw the potato. I was sure.
He saw the potato. He saw the potato. I was sure.
He saw the cat before he saw the potato. He knew it. I was sure.
He knew it. He saw the potato. I was sure.
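import random, re

text = 'He saw the cat before he saw the potato.He knew it.I was sure.'
# Pad every period with spaces so it behaves like a word of its own.
text = text.replace('.', ' . ')
nbsents = text.count('.')

for run in range(0, 10):
    f = ''
    nbchars = 0
    for s in range(0, nbsents):
        # The first word of sentence s runs from the current offset to the next space.
        start = nbchars
        end = text.find(' ', nbchars + 1)
        dot = text.find(' . ', nbchars + 1)
        nbchars = dot + 2  # offset of the space that follows this sentence's period
        chosen = text[start:end].strip(' \t\n\r')
        f = f + text[start:end] + ' '
        while chosen != '.':
            # Find every occurrence of the current word, ignoring case.
            # re.escape keeps words that contain regex metacharacters literal.
            pattern = re.compile(r'\b%s\b' % re.escape(chosen), re.IGNORECASE)
            nextwords = []
            for m in pattern.finditer(text):
                # The follower starts one character past the match and ends at the next space.
                nextwordpos = text.find(' ', m.end() + 1)
                nextwords.append(text[m.end() + 1:nextwordpos])
            chosen = random.choice(nextwords)
            f = f + chosen
            if chosen != '.':
                f = f + ' '
    f = f.replace(' .', '.')
    print(f)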
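The lookup at the heart of this approach can be isolated into a small function. This is a sketch rather than the script's own code: the name `followers` is made up for illustration, and it assumes the text has already been padded so that every token, including each period, is followed by a space.

```python
import re

def followers(text, word):
    """Return the token that comes after each occurrence of `word` in `text`."""
    pattern = re.compile(r'\b%s\b' % re.escape(word), re.IGNORECASE)
    found = []
    for m in pattern.finditer(text):
        start = m.end() + 1            # skip the single space after the match
        stop = text.find(' ', start)   # the follower ends at the next space
        if stop == -1:                 # no space left: the match was the last token
            continue
        found.append(text[start:stop])
    return found

padded = 'He saw the cat before he saw the potato.He knew it.I was sure.'.replace('.', ' . ')
print(followers(padded, 'he'))
print(followers(padded, 'sure'))
```

Because the whole text is rescanned for every generated word, the work grows with both the length of the text and the length of the output, which is part of why this version feels heavier than a precomputed table.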
Approach 2: dictionaries/lists
Conclusion: more lightweight, a bit more modular, and much easier to extend to multiple sentences.
He saw the cat before he saw the cat before he saw the cat before he saw the cat before he knew it. He knew it. I was sure.
He saw the potato. He saw the cat before he saw the cat before he knew it. I was sure.
He saw the cat before he knew it. He saw the potato. I was sure.
He saw the cat before he knew it. He saw the potato. I was sure.
He knew it. He knew it. I was sure.
He saw the potato. He saw the potato. I was sure.
He knew it. He saw the potato. I was sure.
He knew it. He saw the cat before he saw the cat before he saw the cat before he saw the cat before he knew it. I was sure.
He knew it. He saw the cat before he saw the cat before he saw the potato. I was sure.
He saw the cat before he saw the potato. He saw the potato. I was sure.
import random

text = 'He saw the cat before he saw the potato.He knew it.I was sure.'
# Pad every period with spaces so it becomes a token, and lowercase
# so that 'He' and 'he' share one dictionary entry.
text = text.replace('.', ' . ').lower()
words = text.split()

# Build the table once: each word maps to the list of words that follow it.
d = {}
for w, nxt in zip(words, words[1:]):
    d.setdefault(w, []).append(nxt)

# Splitting on '.' leaves an empty piece after the last period; skip empty pieces.
sents = [s.split() for s in text.split('.') if s.strip()]

for run in range(0, 10):
    f = ''
    for allw in sents:
        # Each generated sentence starts with the first word of the matching source sentence.
        chosen = allw[0]
        f = f + chosen.capitalize() + ' '
        while chosen != '.':
            chosen = random.choice(d[chosen])
            f = f + chosen + ' '
    f = f.replace(' .', '.')
    print(f)
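One way to push the modularity further is to separate table building from generation and to pass in the random source, which makes runs repeatable. The sketch below is only an assumption about how that could look: `build_table` and `generate` are hypothetical names, and the `max_words` cap is an added safeguard that the script above does not have.

```python
import random

def build_table(words):
    """Map each word to the list of words that follow it."""
    table = {}
    for w, nxt in zip(words, words[1:]):
        table.setdefault(w, []).append(nxt)
    return table

def generate(text, rng, max_words=200):
    """Generate one sentence per source sentence, each starting with that sentence's first word."""
    text = text.replace('.', ' . ').lower()
    table = build_table(text.split())
    out = []
    for sent in text.split('.'):
        sent_words = sent.split()
        if not sent_words:  # empty piece after the final period
            continue
        chosen = sent_words[0]
        sentence = [chosen.capitalize()]
        # Hypothetical safeguard: stop after max_words even if no period was reached.
        while chosen != '.' and len(sentence) < max_words:
            chosen = rng.choice(table[chosen])
            sentence.append(chosen)
        out.append(' '.join(sentence).replace(' .', '.'))
    return ' '.join(out)

source = 'He saw the cat before he saw the potato.He knew it.I was sure.'
result = generate(source, random.Random(0))
print(result)
```

Passing a seeded `random.Random` instance means the same seed always yields the same text, which helps when comparing the two approaches.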