16-03-2011 Laura Amy Laurier: Difference between revisions
Amy Suo Wu (created page)
Revision as of 13:24, 16 March 2011
Simple Statistics
Legal terminology is the language typically used in terms and conditions policies. We want to highlight the ambiguity of that terminology.
Word frequency distribution
from nltk import FreqDist

# Sample clause taken from a terms and conditions policy
t = "** 20.1 ** SITE shall not be responsible for any failure to perform due to unforeseen circumstances or to causes beyond our reasonable control, including but not limited to: acts of God, such as fire, flood, earthquakes, hurricanes, tropical storms or other natural disasters; war, riot, arson, embargoes, acts of civil or military authority, or terrorism; fiber cuts; strikes, or shortages in transportation, facilities, fuel, energy, labor or materials; failure of the telecommunications or information services infrastructure; hacking, SPAM, or any failure of a computer, server or software, including Y2K errors or omissions, for so long as such event continues to delay the SITE's performance. "
# Split on whitespace; punctuation stays attached to the words
words = t.split()
# Count how often each word occurs
fdist = FreqDist(words)
# The ten most frequent words, most frequent first
voc = [word for word, count in fdist.most_common(10)]
print(voc)
# Optional cumulative frequency plot (requires matplotlib)
#fdist.plot(50, cumulative=True)
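Because `t.split()` splits on whitespace only, tokens such as `fire,` and `SITE's` keep their punctuation and capitalisation, so the same word can be counted under several spellings. The following is a minimal sketch of one way to normalise the words before counting. It uses only the Python standard library rather than NLTK; the helper name `word_frequencies` and the shortened `sample` string are assumptions made for illustration and are not part of the original page.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count lowercase word tokens, ignoring punctuation and digits.

    Hypothetical helper for illustration; not part of the original page.
    """
    # Keep runs of letters, allowing internal apostrophes (e.g. "site's")
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(tokens)


# Shortened sample in the style of the clause above (assumed for illustration)
sample = ("acts of God, such as fire, flood; failure of a computer, "
          "or any failure of software.")
freq = word_frequencies(sample)

# Most frequent words first
print(freq.most_common(3))
```

With this normalisation, `fire,` and `fire` fall under the same key, which gives a cleaner picture of which terms a clause leans on most heavily.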