User:Alice/Code Exercises

From XPUB & Lens-Based wiki
Revision as of 12:12, 28 January 2018 by Alice (talk | contribs)

Oulipo exercise

I improved the N+7 code we wrote in Prototyping by debugging it and adding some tests. This is a work in progress.

Improvements

Bug example: the for loop skipped capitalized words because it could not find them in the input file of nouns.

To fix this, I used the string method lower() to convert each word to lowercase before looking it up.


    for word in separated:
        word = word.lower() + '\n'
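
A minimal sketch of why the lookup missed capitalized words. The three-entry list is an assumed stand-in for the noun file, in the form readlines() would return it, and in_noun_list is a hypothetical helper name:

```python
# Assumed stand-in for a few lines of the noun file, as readlines() returns them
nouns = ['baboon\n', 'baboons\n', 'babushka\n']


def in_noun_list(word, nouns):
    """Look a word up the way the loop does: lowercase it, then add the newline."""
    return (word.lower() + '\n') in nouns


# The membership test is case-sensitive, so the capitalized form is not found
print('Baboons\n' in nouns)
# After lowercasing, the same word is found
print(in_noun_list('Baboons', nouns))
```

The first print shows False and the second shows True, which is the difference the fix makes.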


To check the fix, I added a pytest test:


def test_seven():
    assert seven('Baboons') == 'babushkas'


Full script (in progress)


def seven(sentence):
    # Read the noun list; readlines() keeps the trailing '\n' on every entry
    with open('91K nouns.txt') as noun_file:
        nouns = noun_file.readlines()
    separated = sentence.split()
    new_separated = []
    for word in separated:
        # Lowercase and add '\n' so the word matches the entries in nouns
        word = word.lower() + '\n'
        if word in nouns:
            position = nouns.index(word)
            # Wrap around so nouns near the end of the file do not raise IndexError
            new_word = nouns[(position + 7) % len(nouns)]
            new_separated.append(new_word.strip())
        else:
            # Not a known noun: keep the (lowercased) word as it is
            new_separated.append(word.strip())
    return ' '.join(new_separated)

#sentence = input('What is your sentence? ')
#seven(sentence)

# pytest requires that you name your tests with test_<your-name>
# run with the 'pytest' command in your terminal
def test_seven():
    assert seven('Baboons') == 'babushkas'
    assert seven('Baboons,') == 'babushkas'
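
The second assertion passes a word with a trailing comma, and the lookup in seven does not strip punctuation, so 'baboons,' is not found in the noun list. One possible way to handle this, assuming punctuation only appears at the end of a word, is to split it off before the lookup. split_punctuation is a hypothetical helper, not part of the script above:

```python
import string


def split_punctuation(word):
    """Hypothetical helper: separate trailing punctuation from a word.

    Returns (core, tail), where tail holds any trailing punctuation characters.
    """
    core = word.rstrip(string.punctuation)
    return core, word[len(core):]


core, tail = split_punctuation('Baboons,')
print(core, repr(tail))
```

The core could then be lowercased and looked up as before. The test as written expects 'babushkas' without the comma, so the tail would be dropped rather than re-attached.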

Tesseract exercise

Initial test to train Tesseract to recognise an image as a character or word.

First, using ImageMagick, I converted the JPG file into a TIFF file for better OCR results:

convert -density 300 flower3.jpg -depth 8 -strip -background white -alpha off flower3.tiff

Using Tesseract's page segmentation modes 8 (treat the image as a single word) and 10 (treat the image as a single character), I tested what kind of text output I would get in each case.

tesseract flower.tiff  -psm 8 output
tesseract flower.tiff  -psm 10 output2

The results were:

a

<23
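
The two runs above could also be scripted from Python. This is only a sketch: build_command is a hypothetical helper, and it keeps the `-psm` spelling and argument order of the commands above (Tesseract 4 and later spell the option `--psm`).

```python
# Meaning of the two page segmentation modes tried above
PSM_MODES = {8: 'single word', 10: 'single character'}


def build_command(image, psm, outbase):
    """Hypothetical helper: build the tesseract argument list for one run."""
    return ['tesseract', image, '-psm', str(psm), outbase]


for psm, outbase in [(8, 'output'), (10, 'output2')]:
    cmd = build_command('flower.tiff', psm, outbase)
    # To actually run it: subprocess.run(cmd, check=True)
    print(PSM_MODES[psm], '->', ' '.join(cmd))
```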

I created a box file for the best result (with -psm 10):

tesseract flower4.tiff -psm 10 flower4 makebox

I then opened the image and box file together in moshpytt and edited the content of the box so that it is recognised as the word 'flower'.



python moshpytt.py


Python exercise inspired by the work of Carl Andre

I wrote a script that turns a list of words of different lengths into a pattern similar to the ones Carl Andre typed by hand on a typewriter. So far, all the tests pass. Given a long list of words as input, however, it raises a ValueError, so it still needs debugging.
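
The ValueError comes from the length check in the script: each pattern needs exactly maxlength + (maxlength - 1) words, and the error is raised whenever the number of words that fit is not an exact multiple of that. One possible workaround, sketched here under the assumption that leftover words may simply be dropped, is to trim the list first. trim_to_full_patterns is a hypothetical helper, not part of the script:

```python
def trim_to_full_patterns(words, maxlength):
    """Hypothetical helper: keep fitting words and drop the incomplete tail."""
    # Same filter as the script: only words no longer than maxlength
    fitting = [word for word in words if len(word) <= maxlength]
    # Words needed for one full pattern
    size = maxlength + (maxlength - 1)
    # Largest count that is an exact multiple of the pattern size
    usable = len(fitting) - len(fitting) % size
    return fitting[:usable]


words = ['a', 'aa', 'aaa', 'aa', 'a', 'b', 'bbbb']
print(trim_to_full_patterns(words, 3))
```

Here 'bbbb' is too long and 'b' is a leftover, so only the first five words remain and the length check would pass.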


import pytest
from math import ceil


def grabber(words, numgrab):
    grabbedwords = []
    for number in range(numgrab):
        grabbedwords.append(words.pop(0))
    return (grabbedwords, words)


def pattern(words, maxlength):
    goodwords = []
    for word in words:
        if len(word) <= maxlength:
            goodwords.append(word)

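    # One full pattern needs maxlength + (maxlength - 1) words
    items_pattern = maxlength + (maxlength - 1)
    if len(goodwords) % items_pattern != 0:
        raise ValueError

    # Count patterns from the filtered list, not the raw input,
    # otherwise grabber can run out of words to pop
    times = len(goodwords) // items_pattern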
    final_pattern = []
    for each_time in range(times):
        grabbed, whatisleft = grabber(goodwords, items_pattern)
        goodwords = whatisleft
        middle = ceil(len(grabbed)/2)
        sorted_pattern = (
            sorted(grabbed[:middle]) +
            sorted(grabbed[middle:], reverse=True)
        )
        final_pattern.append(sorted_pattern)

    return final_pattern

def test_pattern_returns_list():
    assert isinstance(pattern(['a', 'b', 'c', 'd', 'e'], 3), list)

def test_pattern_removes_over_max_len():
    right_one = [['a', 'aa', 'aaa', 'aa', 'a']]
    assert pattern(right_one[0] + ['aaaaa'], 3) == right_one

def test_pattern_too_short_wont_work():
    with pytest.raises(ValueError):
        pattern(['a', 'aa'], 3)

def test_grabber():
    assert grabber(['a', 'aaa'], 1) == (['a'], ['aaa'])

def test_two_patterns():
    right_one = ['a', 'aa', 'aaa', 'aa', 'a', 'b', 'bb', 'bbb', 'bb', 'b']
    result = [['a', 'aa', 'aaa', 'aa', 'a'], ['b', 'bb', 'bbb', 'bb', 'b']]
    assert pattern(right_one, 3) == result
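
The pattern function sorts each half of a group with plain sorted(), which compares strings alphabetically. That matches length order for the test words ('a', 'aa', 'aaa') but not in general. If the intent is a pattern by word length, a sketch with key=len could look like this. rise_and_fall is a hypothetical name, and this is an assumption about the intended shape, not the script's current behaviour:

```python
from math import ceil


def rise_and_fall(grabbed):
    """Hypothetical variant: order one group by word length, up then down."""
    middle = ceil(len(grabbed) / 2)
    return (
        sorted(grabbed[:middle], key=len) +
        sorted(grabbed[middle:], key=len, reverse=True)
    )


group = ['zz', 'a', 'mmm', 'b', 'yy']
print(rise_and_fall(group))
# For comparison, the alphabetical ordering of the first half
print(sorted(group[:3]))
```

With this group the first print gives ['a', 'zz', 'mmm', 'yy', 'b'], while alphabetical sorting puts 'mmm' before 'zz' in the first half.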