User:Manetta/wordnet/wordnetwords

From XPUB & Lens-Based wiki

WordNet

WordNet is an English lexical database, created in 1985 at the Cognitive Science Laboratory of Princeton University.

kernel

  • versions:
3.1 (?)
3.0 (2006)
2.1 (2005)
1.7.1 (?)
1.7 (?)
1.6 (?)
1.5 (?)
1.4 (?)
1.3 (?)
1.2 (1992?)
  • license: BSD
  • initiated by George Armitage Miller and currently developed by Christiane Fellbaum.


  • 3 hypotheses from which WordNet started:

Wordnet-an-electronic-lexical-database-1998-forword-separability-hypothesis.png
separability-hypothesis

Wordnet-an-electronic-lexical-database-1998-forword-patterning-hypothesis.png
patterning-hypothesis

Wordnet-an-electronic-lexical-database-1998-forword-comprihensiveness-hypothesis.png
comprehensiveness-hypothesis

WordNet as cultural object (?)

[universal language]
WordNet was created very much with machines in mind. It can therefore be seen as one in a line of attempts to create a universal language, more specifically a universal language that would make human-machine communication possible. It thereby also opens up possibilities for human-machine-machine-machine-human communication.

In contrast to John Wilkins' idea for a universal language, in which words would be built from semantic atoms, WordNet derives meaning from a network-based structure: *'IS-A-KIND-OF' is a semantic relation*[1]. However, when he initiated WordNet, George Miller had just come from a nine-year research project whose approach was close to such an atomic idea.
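
The difference between atoms and a network can be made concrete with a toy example. The sketch below is not WordNet data and not a WordNet API: the `HYPERNYM` dictionary and the `hypernym_chain` helper are hypothetical names, and the entries are hand-picked for illustration. It only shows how a word is placed by following IS-A-KIND-OF links upward, rather than by decomposing it into parts.

```python
# Toy IS-A-KIND-OF network. The entries are hand-picked for illustration
# and are NOT copied from the WordNet database.
HYPERNYM = {
    "poodle": "dog",
    "dog": "animal",
    "animal": "organism",
    "organism": "entity",
}


def hypernym_chain(word):
    """Follow IS-A-KIND-OF links upward until a word has no parent."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


print(" -> ".join(hypernym_chain("poodle")))
# prints: poodle -> dog -> animal -> organism -> entity
```

In this toy network nothing is defined in isolation: "poodle" means something only through its position relative to "dog", "animal" and so on.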

[leading]
Because WordNet is a dataset prepared to function within various pieces of scientific software, it has become a leading source among lexical datasets. From the beginning of WordNet's development, the makers were very conscious of the fact that a relational network structure would let WordNet avoid scaling problems.

[applications]
One application that uses WordNet is Pattern, a text mining application in the form of a Python library. There, WordNet is located at ./pattern-2.6/pattern/text/en/wordnet/dict as 12 text files. These are the data and index files for adjectives, adverbs, nouns and verbs; each entry comes with a definition and an id number, and is sometimes supplemented with an example sentence that shows the term in context. Pattern uses the WordNet definitions in its en-sentiment.xml file, where terms are annotated with their level of positiveness.
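
To give an impression of what one entry in such a data file looks like, here is a minimal sketch of a parser for a single line, assuming the field layout described in WordNet's database file documentation (id number, file number, part of speech, word count, words, pointer count, pointers, then `|` and the definition). `SAMPLE_LINE` and `parse_data_line` are hypothetical names; the sample line is written out by hand for illustration and should not be taken as an exact copy of any WordNet release.

```python
# Illustrative line in the data-file layout (assumed, hand-written sample).
SAMPLE_LINE = (
    "00001740 03 n 01 entity 0 003 "
    "~ 00001930 n 0000 ~ 00002137 n 0000 ~ 04424418 n 0000 "
    "| that which is perceived or known or inferred to have its own "
    "distinct existence (living or nonliving)"
)


def parse_data_line(line):
    """Split one data-file line into id, words, pointers and definition."""
    fields, _, gloss = line.partition("|")
    tokens = fields.split()
    offset, lex_filenum, ss_type = tokens[0], tokens[1], tokens[2]

    # The word count is written as a two-digit hexadecimal number.
    w_cnt = int(tokens[3], 16)
    # Each word is followed by a lex_id, so words sit at every second token.
    words = [tokens[4 + 2 * i] for i in range(w_cnt)]

    # The pointer count is a three-digit decimal number.
    pos = 4 + 2 * w_cnt
    p_cnt = int(tokens[pos])
    pointers = []
    for i in range(p_cnt):
        start = pos + 1 + 4 * i
        symbol, target, target_pos, _source_target = tokens[start:start + 4]
        pointers.append((symbol, target, target_pos))

    return {
        "offset": offset,
        "lex_filenum": lex_filenum,
        "ss_type": ss_type,
        "words": words,
        "pointers": pointers,
        "gloss": gloss.strip(),
    }


entry = parse_data_line(SAMPLE_LINE)
print(entry["words"], len(entry["pointers"]), "pointers")
print(entry["gloss"])
```

Each pointer is a relation symbol plus the id number of another entry, which is how the text files encode the network structure.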

Pattern also provides WordNet in the pattern.en module, for Natural Language Processing (NLP) purposes. The makers write (on clips.ua.ac.be/pattern): "Because language is ambiguous (e.g. I can <--> a can) it uses statistical approaches + regular expressions". The pattern.en.wordnet module makes it possible to search for related words, descriptions, synonyms and available word senses.


[1]: WordNet - an electronic lexical database (1998), foreword, xvi

WordNet in action

  • WordNet is used for natural language processing purposes.
  • difference between MRD / NLP:
      • machine-readable dictionary (MRD) → a dictionary which was printed before, but is now electronic
      • natural language processing (NLP) → a dictionary made from scratch, with an NLP purpose
  • "Search engines may use either a vocabulary, a taxonomy or an ontology to optimise the search results." (from)


included as dictionary package (not a complete list)

  • Pattern → a web mining module for the Python programming language.
  • GoldenDict → an open-source dictionary program
  • Lingoes → a single-click multi-lingual translation software program


WordNet elements — *highlights*

'entity'

the highest level of abstraction in WordNet: 'entity'
entity -- (that which is perceived or known or inferred to have its own distinct existence (living or nonliving))

Mb-WordNet-entity-01.png


teleological links

http://wordnetcode.princeton.edu/standoff-files/teleological-links-README.txt

RELATION:	DESCRIPTION:
action		Describes the typical intended activity (purpose) which
		the artifact was designed for. 
		e.g., a bed is intended for (ACTION) sleeping

For this typical intended activity, there are 11 roles used to describe it.
Note these relations are between the artifact's activity (not the artifact)
and the object mentioned.

RELATION:	DESCRIPTION:
agent		a rester is a (typical) AGENT of sleeping on a bed
beneficiary	an audience is a (typical) BENEFICIARY of showing a movie
cause		tiredness is a (typical) CAUSE of sleeping on a bed
destination	a shore is a (typical) DESTINATION of sailing a boat
experiencer	a child is a (typical) EXPERIENCER of swinging on a swing
instrument	a gun is a (typical) INSTRUMENT of shooting a bullet
location	a bedroom is a (typical) LOCATION of sleeping on a bed
result		rest is a (typical) RESULT of sleeping on a bed
source		a shore is a (typical) SOURCE of sailing a boat
theme		a passenger is a (typical) THEME of transporting by boat
undergoer	a target is a (typical) UNDERGOER of shooting an arrow
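
The example sentences in this table follow a fixed pattern, so they can be read as (activity, role, object) triples. The sketch below assumes only the sentence pattern visible in the README excerpt above; it does not read the actual standoff files, whose format is not shown here. `README_EXAMPLES`, `ROLE_SENTENCE` and `to_triples` are hypothetical names.

```python
import re

# A few of the example sentences from the README table above.
README_EXAMPLES = """\
a rester is a (typical) AGENT of sleeping on a bed
an audience is a (typical) BENEFICIARY of showing a movie
tiredness is a (typical) CAUSE of sleeping on a bed
rest is a (typical) RESULT of sleeping on a bed
"""

# "<object> is a (typical) <ROLE> of <activity>", with an optional article.
ROLE_SENTENCE = re.compile(
    r"^(?:an? )?(?P<obj>.+?) is a \(typical\) (?P<role>[A-Z]+) of (?P<activity>.+)$"
)


def to_triples(text):
    """Turn README-style sentences into (activity, role, object) triples."""
    triples = []
    for line in text.splitlines():
        match = ROLE_SENTENCE.match(line.strip())
        if match:
            triples.append(
                (match["activity"], match["role"].lower(), match["obj"])
            )
    return triples


for activity, role, obj in to_triples(README_EXAMPLES):
    print(f"{activity!r} --{role}--> {obj!r}")
```

The activity comes first in each triple because, as the README notes, the roles relate the artifact's activity (not the artifact itself) to the object mentioned.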


Mb-WordNet-teologicals-proto.png


WordNet alive

Mb-WordNet-alive-tweets-01.png

Mb-WordNet-alive-tweets-02.png

Mb-WordNet-alive-tweets-03.png

Mb-WordNet-alive-tweets-04.png

Mb-WordNet-alive-tweets-05.png