LetterWalk.cgi: Difference between revisions

From XPUB & Lens-Based wiki
Revision as of 16:04, 9 February 2009

We start with a simple loop to read the contents of a wordlist. In this case, I am using the "words.txt" file that is part of the ThinkPython exercises (part of the "swampy" files you can download).

Version 1: Just display the words (hello world)

To start, we take the minimal Python CGI template and add a simple loop that displays the full contents of the file. Note that "words.txt" is a simple kind of database, with one word per line, ordered alphabetically. In many projects, a plain text file like this is enough of a database to get a workable interface running quickly and to test its interactivity early.

#!/usr/bin/python
 
print "Content-type: text/html; charset=utf-8"
print

print "<html><head><title>letterwalk</title></head><body>"

f = open("words.txt", "r")
for w in f:
  print w.strip()

print "</body></html>"

Version 2: Select-a-letter

We add the ability to select and display only those words beginning with a particular letter. We add a variable to the "state" of the page, named 'letter' to represent the selected letter, and use this value to filter the display of the list.

A "legend" is displayed at the top of the page to allow letters to be selected, and also to show the current selection (the current letter gets highlighted). This is a typical pattern in CGI programming.

In creating the links, note the use of Triple Quotes, and Python's String Formatting Operator. These features are extremely useful in generating "template-based" text like HTML.

#!/usr/bin/python
 
import cgi

fs = cgi.FieldStorage()
# get the letter variable from cgi.FieldStorage, default value is a
letter = fs.getvalue("letter", "a")

print "Content-type: text/html; charset=utf-8"
print

print "<html><head><title>letterwalk</title></head><body>"

# Clickable letters with current letter bold
print "<p>"
letters = "abcdefghijklmnopqrstuvwxyz"
for l in letters:
  if l == letter:
    print "<b>%s</b>" % l
  else:
    print """<a href="?letter=%s">%s</a>""" % (l, l)
print "</p>"

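# Show only those words that start with the current letter.
# words.txt is sorted alphabetically, so we can stop reading
# as soon as we pass the current letter.
f = open("words.txt", "r")
for w in f:
  if w[0] == letter:
    print w.strip()
  elif w[0] > letter:
    break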

print "</body></html>"
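
The link template used for the legend can be tried in isolation. The following is a minimal sketch, written to run under Python 2 or 3; the function name legend_html is hypothetical and not part of the script above. It builds the same markup as a string instead of printing it piece by piece.

```python
# Hypothetical helper for illustration; not part of the original script.
def legend_html(letters, current):
    """Return the clickable legend, with the current letter in bold."""
    parts = []
    for l in letters:
        if l == current:
            parts.append("<b>%s</b>" % l)
        else:
            # Triple quotes let the template contain double quotes unescaped;
            # the % operator fills each %s from the tuple, in order.
            parts.append("""<a href="?letter=%s">%s</a>""" % (l, l))
    return "<p>" + " ".join(parts) + "</p>"

print(legend_html("abc", "b"))
```

Returning a string rather than printing directly makes the template easy to check without running a web server.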

Version 3: LeTterCLouD

What if we were to indicate the number of words graphically? A simple way would be to let the size of each letter be related to the number of words in the list that start with that letter. This is a basic kind of "popularity" scaling typical in "tag clouds".

To start, we need to actually do the counting to get the "data" we need to make the sizes. ... Recall the "histogram" example from ThinkPython, exercises 11.2 - 11.5.

Counting words per letter

So now let's use Python to count how many words begin with each letter, and display the results in very simple HTML.

#!/usr/bin/python

print "Content-type: text/html; charset=utf-8"
print

print "<html><head><title>letterwalk</title></head><body>"

lettercounts = {}
f = open("words.txt", "r")
for w in f:
  l = w[0]
  lettercounts[l] = lettercounts.get(l, 0) + 1

letters = lettercounts.keys()
letters.sort()
for l in letters:
  print l, lettercounts[l], "<br />"

print """
</body>
</html> 
"""

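The counting step can also be tested on its own, without the wordlist file. This is a sketch under assumptions: the function name count_first_letters and the sample list are made up for illustration, and blank lines are skipped, which the script above does not do. It runs under Python 2 or 3.

```python
# Hypothetical helper for illustration; not part of the original script.
def count_first_letters(words):
    """Build a histogram: first letter -> number of words starting with it."""
    counts = {}
    for w in words:
        w = w.strip()
        if not w:
            continue  # assumption: ignore blank lines
        # dict.get returns 0 the first time a letter is seen
        counts[w[0]] = counts.get(w[0], 0) + 1
    return counts

# Made-up sample standing in for the lines of words.txt
sample = ["aa\n", "aah\n", "ab\n", "ba\n", "\n"]
counts = count_first_letters(sample)
for l in sorted(counts):
    print("%s %d" % (l, counts[l]))
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample list.
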
Visualizing the data

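To scale the text size, we use the concept of rank: each letter's count is mapped to a value between 0 (the least frequent letter) and 1 (the most frequent letter), and that value is then used to pick a font size between a minimum and a maximum.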

#!/usr/bin/python

print "Content-type: text/html; charset=utf-8"
print

print "<html><head><title>letterwalk</title></head><body>"

lettercounts = {}
f = open("words.txt", "r")
for w in f:
  l = w[0]
  lettercounts[l] = lettercounts.get(l, 0) + 1

letters = lettercounts.keys()
letters.sort()

(min_count, max_count) = (None, None)
for l in letters:
  if min_count == None or lettercounts[l] < min_count:
    min_count = lettercounts[l]
  if max_count == None or lettercounts[l] > max_count:
    max_count = lettercounts[l]

(min_size, max_size) = (8, 96)
for l in letters:
  rank = float(lettercounts[l] - min_count) / (max_count - min_count)
  size = int(min_size + rank * (max_size - min_size))
  print """<span style="font-size: %dpx">%s</span>""" % (size, l)

print "</body></html>"
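
The scaling arithmetic can be isolated in a small function to see which sizes it produces. This is a sketch, not the script's own structure: the name font_size and the sample counts are hypothetical, and the guard for equal minimum and maximum counts is an addition that avoids a division by zero when every letter has the same count. It runs under Python 2 or 3.

```python
# Hypothetical helper for illustration; not part of the original script.
def font_size(count, min_count, max_count, min_size=8, max_size=96):
    """Map a count onto a pixel size between min_size and max_size."""
    if max_count == min_count:
        # assumption: with a single distinct count, use the largest size
        rank = 1.0
    else:
        # rank runs from 0.0 (rarest letter) to 1.0 (most common letter)
        rank = float(count - min_count) / (max_count - min_count)
    return int(min_size + rank * (max_size - min_size))

# Made-up counts: rarest, halfway, most common
for c in (10, 60, 110):
    print("%d words -> %dpx" % (c, font_size(c, 10, 110)))
```

With counts ranging from 10 to 110, a letter with 60 words sits halfway (rank 0.5) and gets 8 + 0.5 * 88 = 52 pixels.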

Version 4: Type by letter

We now want to allow the user to actually type a word by clicking on the letters. The simplest way to do this is to keep two variables in the "state" of the page: one for the "current word" (the result of all previous clicks), and one for the single letter that the user has just clicked. We'll call the second one "addletter" to emphasize the idea that when the user selects a letter, it gets added onto the current word.

We add the current word to the display of the page.

#!/usr/bin/python
 
import cgi

fs = cgi.FieldStorage()
curword = fs.getvalue("curword", "")
addletter = fs.getvalue("addletter", None)

if addletter:
  curword += addletter

print "Content-type: text/html; charset=utf-8"
print

print """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>letterwalk</title>
</head>
<body>
"""

# curword comes from the URL, so escape it before putting it in the page
print "<h2>%s</h2>" % cgi.escape(curword)

lettercounts = {}
f = open("words.txt", "r")
for w in f:
  l = w[0]
  lettercounts[l] = lettercounts.get(l, 0) + 1

letters = lettercounts.keys()
letters.sort()

(min_count, max_count) = (None, None)
for l in letters:
  if min_count == None or lettercounts[l] < min_count:
    min_count = lettercounts[l]
  if max_count == None or lettercounts[l] > max_count:
    max_count = lettercounts[l]

(min_size, max_size) = (8, 96)
for l in letters:
  rank = float(lettercounts[l] - min_count) / (max_count - min_count)
  size = int(min_size + rank * (max_size - min_size))
  print """<a href="?curword=%s&amp;addletter=%s"><span style="font-size: %dpx">%s</span></a>""" % (curword, l, size, l)

print """
</body>
</html> 
"""

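Since the page's state travels in the query string, the links are where that state is written out. The following sketch shows one way to build such a link with URL-encoded values; the function name letter_link is hypothetical, and the use of urlencode is a suggestion rather than what the script above does. Inside an HTML attribute the & that separates the two variables is written as &amp;. The try/except import lets the block run under Python 2 or 3.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2

# Hypothetical helper for illustration; not part of the original script.
def letter_link(curword, letter, size):
    """Return a link that passes curword along and adds one more letter."""
    # urlencode percent-encodes the values and joins the pairs with "&"
    query = urlencode([("curword", curword), ("addletter", letter)])
    # any "&" left after encoding is a separator; escape it for the attribute
    href = "?" + query.replace("&", "&amp;")
    return """<a href="%s"><span style="font-size: %dpx">%s</span></a>""" % (
        href, size, letter)

print(letter_link("ca", "t", 52))
```

A list of pairs is passed to urlencode instead of a dictionary so that the variables always appear in the same order.
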
Final Version: Letter cloud for the "next letters"

Finally, we want the letter cloud to reflect the relative frequency of the "next letters" so that as someone types, the cloud actually reflects the next possible letters.

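Also, note the addition of a "start over" link (using href="?"), which drops all variables from the URL and so resets the game to its beginning. The script now enables cgitb, so that errors are reported in the browser, and it sets the rank to 1.0 when all remaining letters have the same count, which avoids dividing by zero.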

#!/usr/bin/python
 
import cgi
import cgitb; cgitb.enable()

fs = cgi.FieldStorage()
curword = fs.getvalue("curword", "")
addletter = fs.getvalue("addletter", None)

if addletter:
  curword += addletter

print "Content-type: text/html; charset=utf-8"
print

print """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
 
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>letterwalk</title>
</head>
<body>
"""

# curword comes from the URL, so escape it before putting it in the page
print "<h2>%s</h2>" % cgi.escape(curword)

lettercounts = {}
f = open("words.txt", "r")
for w in f:
  w = w.strip()
  if curword == "":
    l = w[0]
    lettercounts[l] = lettercounts.get(l, 0) + 1
  else:
    if w.startswith(curword):
      # print w
      if len(w) > len(curword):
        l = w[len(curword)]
        lettercounts[l] = lettercounts.get(l, 0) + 1

letters = lettercounts.keys()
letters.sort()

(min_count, max_count) = (None, None)
for l in letters:
  if min_count == None or lettercounts[l] < min_count:
    min_count = lettercounts[l]
  if max_count == None or lettercounts[l] > max_count:
    max_count = lettercounts[l]

(min_size, max_size) = (8, 96)
for l in letters:
  if min_count == max_count:
    rank = 1.0
  else:
    rank = float(lettercounts[l] - min_count) / (max_count - min_count)
  size = int(min_size + rank * (max_size - min_size))
  print """<a href="?curword=%s&amp;addletter=%s"><span style="font-size: %dpx">%s</span></a>""" % (curword, l, size, l)

print """<p><a href="?">start over</a></p>"""

print """
</body>
</html> 
"""