User:Dave Young/Prototyping 2

From XPUB & Lens-Based wiki
Revision as of 21:44, 26 October 2011 by Dave Young

"Soldier, Soldier"

During last week's "Soldier, Soldier" marathon playlist that Michael put together, I was struck by how the words were occasionally swapped around a little, or how here and there a word might be replaced by another entirely. It being a folk song, this makes perfect sense: folk music is (or at least was) an oral tradition, meaning that songs were passed down through the generations by word of mouth rather than through "solid" artefacts such as manuscripts or sheet music. The result is an incredibly slow game of Chinese whispers, with the message suffering interpretive interference at each retelling. Curious about WordNet's ability to replicate this effect, I wrote the Python script below. In brief, it breaks the text down into individual words and replaces each word with a random synonym from WordNet's library.

# Simulating oral traditions in folk music with python+wordnet
# Prototyping 2
#   
# Depends: nltk > http://nltk.sourceforge.net/

# import external libs
from nltk.corpus import wordnet
from random import choice

soldier = open("soldier.txt")

# words to leave untouched
ignoreStrings = ["you", "me", "a", "Oh", "I", "no", "to", "on", "As", "as", "was", "it"]

# process the song one line at a time
for lines in soldier:
    line = lines.split()  # split on whitespace; this also drops the trailing newline
    newLine = []

    for word in line:        
        synsets = wordnet.synsets(word)
        allSyns = []    # this list contains every synonym for 'word' from wordnet
        syns = []   # this is an edited list of synonyms, each one only appears once
        
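        # collect the lemma names (synonyms) from every synset of 'word'
        for synset in synsets:
            allSyns.append(synset.lemma_names())  # lemma_names is a method in NLTK 3; older releases exposed it as an attribute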
        
        # if 'word' is on the ignore list, skip synonym checks
        if word in ignoreStrings:
            syns = [word]
        else:
            # check that the synonym list isn't empty
            if len(allSyns) != 0:
                # flatten the per-synset lists into a single list of synonyms
                result = sum(allSyns, [])

                # keep each synonym only once
                for syn in result:
                    if syn not in syns:
                        syns.append(syn)
            else:
                syns = [word]

        # make a random selection
        newWord = choice(syns)
        newLine.append(newWord)

    theLine = ' '.join(newLine)
    print(theLine)

soldier.close()
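
The script above makes a single pass over the song, whereas the "Chinese whispers" idea implies many retellings in a row. Below is a minimal sketch of that, assuming Python 3 and NLTK 3 (where `lemma_names()` is a method). The function names `retell`, `wordnet_lookup` and `generations` are my own and hypothetical, not part of NLTK. The synonym source is passed in as a plain function, so the retelling logic can be tried with a toy dictionary before WordNet is plugged in.

```python
import random


def retell(line, lookup, ignore=(), rng=random):
    """Return `line` with each word swapped for a random synonym.

    `lookup` is any callable mapping a word to a list of candidate synonyms.
    Words in `ignore`, and words with no candidates, are kept unchanged.
    """
    new_words = []
    for word in line.split():
        candidates = [] if word in ignore else lookup(word)
        # dict.fromkeys drops duplicates while keeping first-seen order
        unique = list(dict.fromkeys(candidates))
        new_words.append(rng.choice(unique) if unique else word)
    return " ".join(new_words)


def wordnet_lookup(word):
    """Collect lemma names from every WordNet synset of `word`.

    Assumes nltk and its wordnet corpus are installed.
    """
    from nltk.corpus import wordnet  # imported lazily so the rest runs without nltk
    return [name for synset in wordnet.synsets(word) for name in synset.lemma_names()]


def generations(line, lookup, n, ignore=(), rng=random):
    """Retell `line` n times, feeding each version into the next."""
    versions = [line]
    for _ in range(n):
        versions.append(retell(versions[-1], lookup, ignore, rng))
    return versions
```

Calling `generations("soldier will you marry me", wordnet_lookup, 5, ignore=("you", "me"))` would return the original line followed by five successive retellings. Multi-word lemmas come back from WordNet joined with underscores, so later generations may stop drifting once a word has no further matches.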