Eliza / Doctor

From XPUB & Lens-Based wiki
 
Latest revision as of 11:06, 11 October 2020

Famous "mother of all chatbots" program by Joseph Weizenbaum. The original project in fact comprises two parts: a generic engine for conducting interactive dialogues, and the "DOCTOR" script, which contains rules that emulate (or parody, depending on your reading) a "Rogerian" psychologist. The script uses a mirroring technique to respond to, and provoke, back-and-forth interactions with its user, who is cast in the role of a visiting patient. Weizenbaum published the combined work in 1966 in Communications of the ACM, under the section heading Computational Linguistics, as "ELIZA--A Computer Program For the Study of Natural Language Communication Between Man and Machine" (http://web.stanford.edu/class/linguist238/p36-weizenabaum.pdf). The article includes extensive details of the implementation in an appendix.

ELIZA is a program operating within the MAC time-sharing system at MIT which makes certain kinds of natural language conversation between man and computer possible. Input sentences are analyzed on the basis of decomposition rules which are triggered by key words appearing in the input text. Responses are generated by reassembly rules associated with selected decomposition rules. The fundamental technical problems with which ELIZA is concerned are: (1) the identification of key words, (2) the discovery of minimal context, (3) the choice of appropriate transformations, (4) generation of responses in the absence of key words, and (5) the provision of an editing capability for ELIZA "scripts". A discussion of some psychological issues relevant to the ELIZA approach as well as of future developments concludes the paper.
— Abstract from Weizenbaum's original publication

The original program was written in MAD-SLIP, a language developed by Weizenbaum at the MIT Artificial Intelligence Laboratory in the 1960s. A faithful implementation was made in the 1990s by Charles Hayden in Java and included on the CD-ROM insert of the MIT Press publication The New Media Reader (http://www.newmediareader.com/). In 2005 Norbert Landsteiner made a JavaScript implementation (https://www.masswerk.at/elizabot/) that runs in a web browser (as of 2020) and appears to be based on the details in the publication.

Rules

The README file included in Charles Hayden's Java implementation provides a useful and concise summary description of Weizenbaum's program.

How Eliza Works

All the behavior of Eliza is controlled by a script file.
The standard script is attached to the end of this explanation.

Eliza starts by reading the script file.  Because of Java security, it
must be on the same server as the class files.  Eliza then reads a line at
a time from the user, processes it, and formulates a reply.

Processing consists of the following steps.
First, the sentence is broken down into words, separated by spaces.  All further
processing takes place on these words as a whole, not on the individual
characters in them.
Second, a set of pre-substitutions takes place.
Third, Eliza takes all the words in the sentence and makes a list of all
keywords it finds.  It sorts this keyword list in descending weight.  It
processes these keywords until it produces an output.
Fourth, for the given keyword, a list of decomposition patterns is searched.
The first one that matches is selected.  If no match is found, the next keyword
is selected instead.
Fifth, for the matching decomposition pattern, a reassembly pattern is
selected.  There may be several reassembly patterns, but only one is used
for a given sentence.  If a subsequent sentence selects the same decomposition
pattern, the next reassembly pattern in sequence is used, until they have all
been used, at which point Eliza starts over with the first reassembly pattern.
Sixth, a set of post-substitutions takes place.
Finally, the resulting sentence is displayed as output.
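
The processing steps above can be sketched in Python roughly as follows. This is an illustrative sketch and not Hayden's Java code: the names `match`, `Eliza`, `PRE`, `POST` and `KEYS` are hypothetical, the rules are made up for the example rather than taken from the DOCTOR script, and two details are assumptions (a * captures the shortest possible run of words, and post-substitutions are applied only to the captured words).

```python
import re

def match(pattern, words):
    """Match a decomposition pattern (list of tokens) against a list of words.
    Returns one list of captured words per '*', or None if there is no match.
    Assumption: each '*' captures the shortest run of words that works."""
    if not pattern:
        return [] if not words else None
    head, rest = pattern[0], pattern[1:]
    if head == "*":
        for i in range(len(words) + 1):
            tail = match(rest, words[i:])
            if tail is not None:
                return [words[:i]] + tail
        return None
    if words and words[0] == head:
        return match(rest, words[1:])
    return None

# Hypothetical miniature script (not the DOCTOR script).
PRE = {"i'm": "i am", "dont": "don't"}
POST = {"i": "you", "my": "your", "me": "you", "am": "are"}
# keyword -> (weight, [(decomposition pattern, [reassembly patterns])])
KEYS = {
    "remember": (5, [("* i remember *", ["Do you often think of (2)?",
                                         "What else do you remember?"])]),
    "i": (1, [("* i am *", ["How long have you been (2)?"])]),
}

class Eliza:
    def __init__(self, keys, pre, post):
        self.keys, self.pre, self.post = keys, pre, post
        self.next_reasmb = {}  # (keyword, decomposition) -> index of next reassembly

    def respond(self, sentence):
        # Step 1: break the sentence into words (lower-casing and dropping
        # punctuation are simplifications made for this sketch).
        words = re.findall(r"[a-z']+", sentence.lower())
        # Step 2: pre-substitutions; a replacement may expand to several words.
        words = [w2 for w in words for w2 in self.pre.get(w, w).split()]
        # Step 3: collect the keywords present, sorted by descending weight.
        found = sorted(dict.fromkeys(w for w in words if w in self.keys),
                       key=lambda k: -self.keys[k][0])
        for key in found:
            for decomp, reasmbs in self.keys[key][1]:
                # Step 4: the first decomposition pattern that matches is selected.
                parts = match(decomp.split(), words)
                if parts is None:
                    continue
                # Step 5: use the reassembly patterns in rotation.
                i = self.next_reasmb.get((key, decomp), 0)
                self.next_reasmb[(key, decomp)] = (i + 1) % len(reasmbs)

                # Step 6: post-substitutions, applied here to the captured words.
                def fill(m):
                    captured = parts[int(m.group(1)) - 1]
                    return " ".join(self.post.get(w, w) for w in captured)

                return re.sub(r"\((\d+)\)", fill, reasmbs[i])
        return None  # no keyword produced an output
```

With these example rules, `Eliza(KEYS, PRE, POST).respond("I remember my dog")` selects the heavier keyword "remember" and answers "Do you often think of your dog?"; repeating the same input moves on to the next reassembly pattern.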

The script is used to construct the pre and post substitution lists, the
keyword lists, and the decomposition and reassembly patterns.  
In addition, there is a synonym matching facility, which is explained below.

Every line of script is prefaced by a tag that tells what list it is 
part of.  Here is an explanation of the tags.

initial:	Eliza says this when it starts.
final:		Eliza says this when it quits.
quit:		If the input is this, then Eliza quits.  Any number permitted.
pre:		Part of the pre-substitution list.  If the first word appears in 
		    the sentence, it is replaced by the rest of the words.
post:		Part of the post-substitution list.  If the first word appears 
		    in the sentence, it is replaced by the rest of the words.
key:		A keyword.  Keywords with greater weight are selected in 
		    preference to ones with lesser weight.
		    If no weight is given, it is assumed to be 1.
decomp:		A decomposition pattern.  The character * stands for any 
		    sequence of words.  
reasmb:		A reassembly pattern.  A set of words matched by * in 
		    the decomposition pattern can be used as part of the reassembly.
		    For instance, (2) inserts the words matched by the second * 
		    in the decomposition pattern.
synon:		A list of synonyms.  In a decomposition rule, for instance, @be
		    matches any of the words "be am is are was" because of the line:
			"synon: be am is are was".  The match @be also counts as a *
		    in numbering the matches for use by reassembly rules.
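
As a rough illustration of the tags above, the following Python sketch reads tagged script lines into a dictionary. It is not Hayden's parser: `parse_script`, `synonym_group` and `SAMPLE` are hypothetical names, the sample lines are written for this example, and the sketch assumes that each decomp line belongs to the most recent key line and each reasmb line to the most recent decomp line.

```python
def parse_script(text):
    """Parse 'tag: value' script lines into a nested dictionary."""
    script = {"initial": [], "final": [], "quit": [],
              "pre": {}, "post": {}, "synon": [], "keys": {}}
    key = decomp = None
    for line in text.splitlines():
        tag, sep, value = line.strip().partition(":")
        if not sep:
            continue  # blank or untagged line
        tag, value = tag.strip(), value.strip()
        if tag in ("initial", "final", "quit"):
            script[tag].append(value)
        elif tag in ("pre", "post"):
            # First word is replaced by the rest of the words.
            first, _, rest = value.partition(" ")
            script[tag][first] = rest.strip()
        elif tag == "synon":
            script["synon"].append(value.split())
        elif tag == "key":
            parts = value.split()
            # If no weight is given, it is assumed to be 1.
            weight = int(parts[1]) if len(parts) > 1 else 1
            key = {"weight": weight, "decomp": []}
            script["keys"][parts[0]] = key
            decomp = None
        elif tag == "decomp" and key is not None:
            decomp = {"pattern": value, "reasmb": []}
            key["decomp"].append(decomp)
        elif tag == "reasmb" and decomp is not None:
            decomp["reasmb"].append(value)
    return script

def synonym_group(script, word):
    """Words that '@word' may stand for in a decomposition rule."""
    for group in script["synon"]:
        if word in group:
            return group
    return [word]

# Illustrative script fragment (written for this sketch).
SAMPLE = """\
initial: How do you do.  Please tell me your problem.
quit: bye
pre: dont don't
post: my your
synon: be am is are was
key: remember 5
  decomp: * i remember *
    reasmb: Do you often think of (2)?
key: sorry
  decomp: *
    reasmb: Please don't apologise.
"""

script = parse_script(SAMPLE)
```

Here `script["keys"]["remember"]["weight"]` is 5, the keyword "sorry" gets the default weight of 1, and `synonym_group(script, "be")` returns the list of words that `@be` would match.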

Other Special Rules
If a $ appears first in a decomposition rule, then the output is formed as
normal, but is saved and Eliza goes on to the next keyword.  If no keywords
match, and there are saved sentences, one of them is picked at random and
used as the output, then it is discarded.
If there are no saved sentences, and no keywords match, then it uses the
keyword "xnone".
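
The saving and fallback behaviour described above might look like this in Python. This is a sketch under assumptions, not the original code: `choose_reply` is a hypothetical helper, `candidates` stands for the outputs already formed for each matching keyword in priority order, and `xnone_reply` stands for whatever the "xnone" keyword would produce.

```python
import random

def choose_reply(candidates, memory, xnone_reply, rng=random):
    """Pick the reply for one input sentence.

    candidates: (decomposition pattern, formed output) pairs, one per
                matching keyword, in keyword-priority order.
    memory:     list of saved sentences, modified in place.
    """
    for decomp, output in candidates:
        if decomp.startswith("$"):
            memory.append(output)  # save the output, go on to the next keyword
        else:
            return output
    if memory:
        # No keyword gave a direct reply: use a random saved sentence,
        # removing it from memory so it is not used again.
        return memory.pop(rng.randrange(len(memory)))
    return xnone_reply  # nothing saved either: fall back to "xnone"
```

For example, if one input produces both a `$` rule output and an ordinary output, the ordinary one is returned and the other is kept in `memory`; a later input that matches no keyword then receives the saved sentence, and only once memory is empty does the "xnone" reply appear.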


Other similar (but not complete) eliza implementations: