User:Dave Young/Prototyping 2
"Soldier, Soldier"
During last week's "Soldier, Soldier" marathon playlist that Michael put together, I was struck by how the words were occasionally swapped around a little, or how here and there a word might be replaced by another entirely. It being a folk song, this makes perfect sense: folk music is (or at least was) an oral tradition - meaning that songs would be passed down through the generations by word of mouth rather than through "solid" artefacts such as manuscripts or sheet music. The result is an incredibly slow game of Chinese whispers, with the message suffering interpretive interference with each retelling. Curious about Wordnet's ability to replicate this effect, I wrote the Python script below. In brief, it breaks the text down into individual words and replaces each word with a random synonym from Wordnet's library, leaving a short list of common words untouched.
# Simulating oral traditions in folk music with python+wordnet
# Prototyping 2
#
# Depends: nltk > http://nltk.sourceforge.net/
# import external libs
from nltk.corpus import wordnet
import string
import random
from random import choice
soldier = open("soldier.txt")
# filter out some words
ignoreStrings = ["you", "me", "a", "Oh", "I", "no", "to", "on", "As", "as", "was", "it"]
# set up arrays
for lines in soldier:
    # split() with no argument splits on any whitespace and drops the trailing '\n'
    line = lines.split()
    newLine = []
    for word in line:
        synsets = wordnet.synsets(word)
        allSyns = [] # one list of lemma names per synset that wordnet finds for 'word'
        syns = [] # this is an edited list of synonyms, each one only appears once
        # make a list of synonyms in "synsets"
        for synonym in synsets:
            # lemma_names is a method in NLTK 3; older releases exposed it as an attribute
            allSyns.append(synonym.lemma_names())
        # if 'word' is on the ignore list, skip synonym checks
        if word in ignoreStrings:
            syns = [word]
        else:
            # check that the synonym list isn't empty
            if len(allSyns) != 0:
                # result is a single flat list containing all synonyms
                result = sum(allSyns, [])
                # keep only the first occurrence of each synonym
                for syn in result:
                    if syn not in syns:
                        syns.append(syn)
            else:
                # wordnet has no entry for this word, so keep it unchanged
                syns = [word]
        # make a random selection
        newWord = choice(syns)
        newLine.append(newWord)
    theLine = ' '.join(newLine)
    print(theLine)
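The script above performs a single round of substitution. The "Chinese whispers" idea of a message drifting with each retelling could be approximated by feeding every output back in as the next input. The following is only a rough Python 3 sketch of that idea, not part of the original script: it uses a small hand-written synonym table (`TOY_SYNONYMS`, a hypothetical stand-in for Wordnet) so that it runs without NLTK, and the function names `retell` and `pass_down` are made up for illustration.

```python
import random

# Hypothetical stand-in for Wordnet: small groups of interchangeable words.
SYNONYM_GROUPS = [
    ["soldier", "warrior", "trooper"],
    ["marry", "wed", "espouse"],
    ["coat", "jacket", "cloak"],
]

# Map every word to its whole group, so a replaced word can drift again later.
TOY_SYNONYMS = {word: group for group in SYNONYM_GROUPS for word in group}


def retell(words, synonyms, rng):
    """Return one retelling: each known word is swapped for a random synonym."""
    return [rng.choice(synonyms.get(word, [word])) for word in words]


def pass_down(text, synonyms, generations, seed=None):
    """Retell `text` repeatedly, feeding each version into the next one."""
    rng = random.Random(seed)
    versions = [text.split()]
    for _ in range(generations):
        versions.append(retell(versions[-1], synonyms, rng))
    return [" ".join(version) for version in versions]


if __name__ == "__main__":
    line = "soldier will you marry me"
    for number, version in enumerate(pass_down(line, TOY_SYNONYMS, 5, seed=1)):
        print(number, version)
```

Swapping `TOY_SYNONYMS` for real Wordnet lookups would presumably drift much further, since every replacement word brings its own set of synonyms with it.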