Category:Enguage

{{language|Enguage}}
Enguage is a speech understanding algorithm, implemented in plain Java, which supports the idea that speech is Turing complete, so respectfully requests a place on Rosetta Code.
It is informed by linguistic theory, in particular
[https://en.wikipedia.org/wiki/Pragmatic_maxim Pragmatism],
[https://en.wikipedia.org/wiki/Ordinary_language_philosophy Ordinary Language Philosophy] and
[https://en.wikipedia.org/wiki/Speech_act Speech Act theory].
The latter two were developed in the second half of the 20th Century, that is, at the same time as the rise of traditional context-free programming languages.
 
While the examples here are given in English, it can be applied to any natural language.
While untested, it should work with Unicode characters, so it could be used to implement [https://en.wikipedia.org/wiki/John_Searle John Searle]'s [https://en.wikipedia.org/wiki/Chinese_room Chinese Room].
Enguage, therefore, refers to the interpreter and not the language which it interprets.

It ''can'' be used to process data, but it is not efficient in doing this;
and it is largely unsuitable for interpreting writing, as this is not a discourse.

==Background==
Enguage was developed in C between 2011 and 2013, and in Java ever since.
It remains experimental; however,
it won the [http://www.bcs-sgai.org/micomp/pastcomps.php British Computer Society's Machine Intelligence Competition] in 2016.
The name "Enguage" is a portmanteau of the words Language and Engine - hence its unconventional spelling.
 
If you have '''make''' and '''git''' installed, Enguage can be downloaded from the source code repo<ref name="src">[http://bitbucket.org/martinwheatman/enguage the source code repo]</ref>,
built, and run in three ways, thus:
<pre>
$ git clone https://bitbucket.org/martinwheatman/enguage.git
$ cd enguage
$ make jar
$ export PATH=$PATH:./sbin # the full test suite calls scripts in ./sbin
$ java -jar lib/enguage.jar -t
$ java -jar lib/enguage.jar -T hello
</pre>
Repertoires, each supporting a concept, are under '''etc/rpts''', and unit tests are under '''etc/test'''.
An 'active' dictionary, of entries with an embedded unit test, is under '''etc/dict'''.
The interpretation of language can also be supplied by utterance, i.e. by speech, and some of the written specifications demonstrate this style.
 
==Linguistic Influence==
Enguage draws inspiration from Semiotics and linguistic theory and is an attempt to implement [https://en.wikipedia.org/wiki/Ordinary_language_philosophy Ordinary Language Philosophy].
This rejects the idea that there is an underlying mathematical logic to language.
The schism between logic and semiotics is reflected in the dyadic and triadic sign models.
Semiotics is the study of signs, and a sign is simply the atomic element of meaning.
This is a brief synopsis of the influences.
===Pragmatism===
[https://en.wikipedia.org/wiki/Charles_Sanders_Peirce Charles Sanders Peirce] devised the philosophy of Pragmatism, that things are defined by their effect, which is summarised in his [https://en.wikipedia.org/wiki/Pragmatic_maxim Pragmatic Maxim].
===Dyadic Semiology===
[https://en.wikipedia.org/wiki/Ferdinand_de_Saussure Ferdinand de Saussure] created
[https://en.wikipedia.org/wiki/Course_in_General_Linguistics#Syntagmatic_and_paradigmatic_relations the first synchronic model of language] - how a language works at any one point in time.
He described a dyadic sign at the heart of this Semiology.
Here, a ''signifier'', a written or spoken artefact, ''signifies'' a mental image.
There is an '''arbitrary''' link between what is said and its signified mental image.
 
===Triadic Semiotics===
Around the same time, but independently,
Charles Sanders Peirce devised a triadic model, which is composed of:
a sign vehicle, or ''Representamen'';
a referent ''Object'' to which the sign vehicle refers; and,
the necessary reasoning within the mind to make the connection between the two, ''Interpretant''.
Therefore, the triadic sign is a model of subjectivity.
 
Peirce describes the object being referenced in one of three ways, and as symbolic information, language works at the third level.
and at the second level there are many instances of this, such as on a hot tap or a stop switch on a machine,
but it is at the third level that we propagate this as an idea by saying, "red means danger".
 
===The Meaning of Meaning===
[[File:Semiotic Triangle.png|thumb|right|alt=An equilateral triangle is labelled with Symbol to the left, Object to the right, and Thought or Reference to the top apex. The base line is dotted to signify the implied relationship between the Symbol and Object is only achieved through the Thought or Reference of the interpreter.|The Semiotic Triangle of Reference, figure taken from page 11 of The Meaning of Meaning.]]
British linguists C. K. Ogden and I. A. Richards wrote The Meaning of Meaning <ref name="MoM">[http://courses.media.mit.edu/2004spring/mas966/Ogden%20Richards%201923.pdf The Meaning of Meaning]</ref> in 1923,
which draws on Peirce's Semiotics, and illustrates the functioning of speech in the Semiotic Triangle.
The Symbol, bottom left corner, implies the referent Object, bottom right, through the dotted baseline; but that connection is only ever made through the top apex, by thinking.
 
Further, the Symbol has a 1:1 relationship with the interpretation, but those Thoughts or Reference may refer to one or more objects.
This illustrates the difference between an arithmetical function, which has one return value, and a function in a programming language, which, due to its conditional processing, may give one of several replies.
 
Ogden and Richards highlighted the symbolic nature of speech: "Words, as everyone now knows, 'mean' nothing by themselves, although the belief that they did, ..., was once equally universal" (pp.9-10).
 
===Gödel Numbering===
In his [https://en.wikipedia.org/wiki/G%C3%B6del's_incompleteness_theorems Incompleteness Theorem], Kurt Gödel devised a numbering scheme, [https://en.wikipedia.org/wiki/G%C3%B6del_numbering Gödel Numbering], as a representation of logical proofs, so as to show that there are statements that can be made in a system which cannot be proved by that system. This is used to express "this sentence is not provable", ultimately showing mathematics to be incomplete. This non-consecutive numbering system can be adapted to present symbolic representations in linguistics.
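
As a rough illustration of the numbering idea, the following Java sketch encodes a sequence of positive symbol codes as a product of prime powers (2 to the first code, times 3 to the second, times 5 to the third, and so on), which is the textbook form of Gödel numbering, and decodes it again by repeated division. The class name, method names and symbol codes are hypothetical and chosen only for illustration; this is not taken from the Enguage source.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

class GodelNumbering {

    // Trial-division primality test; adequate for the small primes used here.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Smallest prime strictly greater than n.
    static int nextPrime(int n) {
        int candidate = n + 1;
        while (!isPrime(candidate)) candidate++;
        return candidate;
    }

    // Encode codes c1, c2, c3, ... as 2^c1 * 3^c2 * 5^c3 * ...
    // Codes must be positive so that the sequence can be recovered.
    static BigInteger encode(int[] codes) {
        BigInteger result = BigInteger.ONE;
        int prime = 2;
        for (int code : codes) {
            if (code < 1) throw new IllegalArgumentException("codes must be positive");
            result = result.multiply(BigInteger.valueOf(prime).pow(code));
            prime = nextPrime(prime);
        }
        return result;
    }

    // Recover the codes by counting how often each successive prime divides the number.
    static List<Integer> decode(BigInteger number) {
        List<Integer> codes = new ArrayList<>();
        int prime = 2;
        while (number.compareTo(BigInteger.ONE) > 0) {
            BigInteger p = BigInteger.valueOf(prime);
            int exponent = 0;
            while (number.mod(p).signum() == 0) {
                number = number.divide(p);
                exponent++;
            }
            codes.add(exponent);
            prime = nextPrime(prime);
        }
        return codes;
    }

    public static void main(String[] args) {
        BigInteger n = encode(new int[] {1, 2, 3});
        System.out.println(n);          // 2^1 * 3^2 * 5^3 = 2250
        System.out.println(decode(n));  // [1, 2, 3]
    }
}
```

The resulting numbers are non-consecutive: most integers are not the code of any sequence of positive values, but every valid code decodes to exactly one sequence.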
 
===Speech Act Theory===
[https://en.wikipedia.org/wiki/J._L._Austin J. L. Austin] used his William James Lectures, at Harvard in 1955, to critique the traditional focus of language analysis on truth statements.
He introduced the idea of performative statements, which include:
what is uttered, locution;
what is meant by this, illocution; and,
This became codified in John Searle's Speech Act theory.
===Implicature===
[https://en.wikipedia.org/wiki/Paul_Grice H. Paul Grice's] William James Lectures, at Harvard in 1967, introduced his ideas on meaning outside of traditional linguistics.
This includes the idea that meaning, in a wider sense, is implicated by utterances.
 
==Algorithm==
Enguage models utterances as symbols and replicates the idea of function illustrated in the Semiotic Triangle. As a symbol processing system, Enguage is grounded in symbols. Therefore, referent objects, and thoughts and references, are also symbols. Thoughts and references may also serve as symbols in further triads, just as functions in a programming language may call other functions, and so there is a process of [https://en.wikipedia.org/wiki/Semiosis semiosis] occurring.
 
Symbols can be represented as numeric values. In software, this is typically as consecutive numbers, such as false=0 and true=1. In a simplified representation, a string may be represented numerically, thus: "a"=1, "b"=2, ... "z"=26, "aa"=27, and so on. A full utterance is merely an array of such values, to an arbitrary base, so ["i", "need", "a", "coffee"] is simply a large integer value, in much the same way as [1,2,3,4] might represent 1,234 in base 10. Enguage is not concerned with deconstructing symbols nor with defining what a symbol, ostensibly, means.
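
The simplified numbering described above can be sketched in Java as follows. The first method gives each lowercase word its value in the scheme "a"=1 ... "z"=26, "aa"=27; the second treats a whole utterance as a sequence of such values read as digits in a chosen base. The class name, method names and the choice of base are assumptions made for illustration, not part of Enguage.

```java
import java.math.BigInteger;

class SymbolNumbers {

    // "a"=1 ... "z"=26, "aa"=27, "ab"=28, ... (letters act as digits 1..26).
    static BigInteger wordValue(String word) {
        BigInteger value = BigInteger.ZERO;
        BigInteger radix = BigInteger.valueOf(26);
        for (char c : word.toCharArray()) {
            if (c < 'a' || c > 'z') {
                throw new IllegalArgumentException("only a-z supported in this sketch: " + word);
            }
            value = value.multiply(radix).add(BigInteger.valueOf(c - 'a' + 1));
        }
        return value;
    }

    // Read the word values as digits of one large integer in the given base.
    // The base must be larger than every word value for the result to be unambiguous.
    static BigInteger utteranceValue(String[] words, long base) {
        BigInteger radix = BigInteger.valueOf(base);
        BigInteger value = BigInteger.ZERO;
        for (String word : words) {
            BigInteger digit = wordValue(word);
            if (digit.compareTo(radix) >= 0) {
                throw new IllegalArgumentException("base too small for: " + word);
            }
            value = value.multiply(radix).add(digit);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(wordValue("aa"));  // 27
        // "a","b","c","d" have values 1,2,3,4, so in base 10 this mirrors [1,2,3,4] -> 1234.
        System.out.println(utteranceValue(new String[] {"a", "b", "c", "d"}, 10));  // 1234
        // A full utterance needs a much larger base; 1,000,000,000 is an arbitrary choice.
        System.out.println(utteranceValue(new String[] {"i", "need", "a", "coffee"}, 1_000_000_000L));
    }
}
```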
 
Enguage swaps two symbols: the user's '''utterance''', "''hello''", with one of the '''replies''' given in the interpretation, "''hello to you too''", but this is not a dyadic relationship.
It can be defined, directly, by the utterance ''On "hello", reply "hello to you too".''; but this instruction can be augmented by further implications, e.g. This implies ... .
There is always the intermediate thought: the symbol always refers to (or ''symbolises'' in the above diagram) the instructions on how to reply.
 
As a simple arithmetic example, "what is two plus two" may imply "4, 2 + 2 is 4", but can only do so through the thought "{2 + 2}". This unnamed function is known as a [https://en.wikipedia.org/wiki/Lambda_calculus lambda]. This is why linguistics is more complex than arithmetic, and perhaps why mathematics does not underpin language. The lambda, being a further utterance, requires the maintenance, and use, of a replied answer (e.g. 4), a socially defined format within which to place it (e.g. "ANSWER, UTTERANCE is ANSWER"), and the status, or felicitous outcome, of that thought to direct the processing of the lambda.
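
The arithmetic example can be sketched in Java as below: the thought "2 + 2" is evaluated to an answer, which is then placed into the reply format "ANSWER, UTTERANCE is ANSWER". This is only a sketch of the idea; the class and method names are hypothetical, the evaluator handles just three operators, and Enguage's own mechanism may differ.

```java
class ArithmeticThought {

    // The socially defined reply format quoted in the text.
    static final String FORMAT = "ANSWER, UTTERANCE is ANSWER";

    // Evaluate a thought of the form "<int> <op> <int>", e.g. "2 + 2".
    static int evaluate(String thought) {
        String[] parts = thought.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected: <int> <op> <int>");
        }
        int left = Integer.parseInt(parts[0]);
        int right = Integer.parseInt(parts[2]);
        switch (parts[1]) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default: throw new IllegalArgumentException("unknown operator: " + parts[1]);
        }
    }

    // Keep the answer, then place it, with the thought, into the reply format.
    static String reply(String thought) {
        String answer = String.valueOf(evaluate(thought));
        return FORMAT.replace("ANSWER", answer).replace("UTTERANCE", thought);
    }

    public static void main(String[] args) {
        System.out.println(reply("2 + 2"));  // 4, 2 + 2 is 4
    }
}
```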
 
===Ambiguity===
Each utterance has one or more interpretations <ref name="disamb">[http://dx.doi.org/10.1007/978-3-319-42102-5_16 A Pragmatic Approach to Disambiguation, ICISO, 2016]</ref>, each of which is the equivalent of a function in a traditional programming language.
Each function can be specified in a '''.txt''' file, or can be created by utterance, e.g. "''to the phrase hello reply hello to you too.''"
For example, a simple repertoire might be: ''i need a coffee'', ''i do not need a coffee'', ''do i need a coffee'' and ''what do i need''.
 
===Hooks===
As well as the 'on "..."' and 'reply "..."' imperatives, Enguage also has several other such 'hooks' to allow other operations available to the software to be called, such as perform "..." to access the Java classes, and run "..." to run an external command. That Enguage passes off processing to traditional software is regarded as little different to machine code operating an ALU to provide arithmetic operations.
 
===Turing Complete Lambda===
The Turing complete quality of a function is that it is represented as a list of instructions, a [https://en.wikipedia.org/wiki/Lambda_calculus lambda].
These instructions can form loops, and can be conditionally operated, and are expressed in the interpreted language, not in source code (i.e. within the interpreter).
The felicitous nature of a thought can be used by prefixing a subsequent thought with, for example, 'if so, ...', if the outcome is positive, and 'if not, ...' if it is negative.
So as not to create ''reserved words'', these can be configured as required by the language.
This supplies the idea of conditional processing, and recalling (recursion) is used to create loops;
see the [https://rosettacode.org/wiki/FizzBuzz#Enguage FizzBuzz] example.
Thus, interaction with Enguage is always given as a Turing complete discourse: utterance to reply.
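
The conditional use of a thought's felicity can be sketched in Java as below. Each instruction is run in order; the outcome of the last thought decides whether a following 'if so, ...' or 'if not, ...' instruction is taken, and the first 'reply' reached ends the interpretation. This is a hypothetical illustration only: the class, the <code>perform</code> callback and the fixed prefixes are assumptions, and it does not reproduce the Enguage interpreter (which, as noted above, allows these prefixes to be configured).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FelicitySketch {

    // Strip one pair of surrounding double quotes, if present.
    static String unquote(String text) {
        if (text.length() >= 2 && text.startsWith("\"") && text.endsWith("\"")) {
            return text.substring(1, text.length() - 1);
        }
        return text;
    }

    // Run the instructions in order. 'perform' is a stand-in for thinking a thought:
    // it returns true for a felicitous outcome and false otherwise.
    static String interpret(List<String> instructions, Predicate<String> perform) {
        boolean felicitous = true;
        for (String instruction : instructions) {
            String step = instruction.trim();
            if (step.startsWith("if so, ")) {
                if (!felicitous) continue;           // skip: last outcome was negative
                step = step.substring("if so, ".length());
            } else if (step.startsWith("if not, ")) {
                if (felicitous) continue;            // skip: last outcome was positive
                step = step.substring("if not, ".length());
            }
            if (step.startsWith("reply ")) {
                return unquote(step.substring("reply ".length()));
            }
            felicitous = perform.test(step);         // record the outcome of this thought
        }
        return "";                                   // no reply was reached
    }

    public static void main(String[] args) {
        List<String> thoughts = Arrays.asList(
            "is someone set to i",
            "if so, reply \"ok, we are holding hands\"",
            "reply \"ok, you are holding hands with SOMEONE\"");
        System.out.println(interpret(thoughts, step -> true));   // ok, we are holding hands
        System.out.println(interpret(thoughts, step -> false));  // ok, you are holding hands with SOMEONE
    }
}
```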
 
==Concepts==
A concept is supported by a repertoire of utterances.
===''We''===
The concept of ''we'' is supported by [https://bitbucket.org/martinwheatman/enguage/src/develop/etc/dict/w/we.entry the following repertoire].
This file also includes unit tests for the concept, behind the #] symbol.
 
Perhaps quite subtly, you and i means that we is set to i.
This is because Enguage swaps the personal pronouns, internally, which perhaps needs fixing?
<pre>
On "we are you and i":
set the value of we to i;
reply "ok, we means you and i".
 
On "we are THEM and i":
set the value of we to THEM;
reply "ok, we means you and THEM".
 
On "who are we":
get the value of we;
if not, set the value of we to i;
reply "ok, we means you and ...".
</pre>
This concept is built upon the concept of setting/getting values, which is ultimately implemented by the Enguage Value class.
The truth of 'who ''we'' are' is set and used by utterance.
===''Holding Hands''===
The concept of ''we'' can be used in the concept of holding hands.
What follows is a fragment of the Holding Hands concept, the portion used in the If/Then concept.

Holding hands can use the ''we'' concept, by asking who are we, in the first line of the interpretant.
There are two replies, depending on whether you and I are holding hands or I am holding someone else's hand.
The 4th line shows that Enguage still swaps words to an internal version, so it swaps personal pronouns: I/you.
This doesn't read correctly, and should be addressed: there is no ''ghost in the machine'' to be satisfied.
<pre>
On "we are holding hands":
who are we;
set someone to ...;
i am holding hands with SOMEONE;
is someone set to i;
if so, reply "ok, we are holding hands";
reply "ok, you are holding hands with SOMEONE".
</pre>
This value can then be retrieved.
There are three replies:
if I am not currently holding hands;
if I am holding a third person's hand; or,
if I am holding your hand.
<pre>
On "whose hand am i holding":
perform "link exists martin holdinghands";
if not, reply "sorry, you are not holding anyone's hand";
perform "link get martin holdinghands";
set someone to ...;
is someone set to i;
if so, reply "ok, you are holding my hand";
reply "ok, you are holding SOMEONE''s hand".
</pre>
 
==References==