
The Hercules Parser

Patrick A. Cameron Waikato University Student patrickcameron@hotmail.com

Abstract

The Hercules parser provides a simple program for investigating patterns in discourse and applying them. The parser provides flexibility through scripting, while allowing an operator to understand the frames and concepts without extensive knowledge of programming or scripting languages. The SQL and XML integrations allow well-known query languages to form the basis for data extraction and manipulation. Wordnet is used as the basis for forming concepts and patterns of concepts, allowing a vast number of already established relationships to be explored using techniques such as self-organizing maps and cluster analysis. Semantic roles, attributes, and hyponym hierarchies play a key role in word sense disambiguation, where pattern recognition forms underlying concepts that are reinforced by logical statements, with the aim of reaching a true Artificial Intelligence simulated by a computer.

Contents

Abstract
List of Figures
List of Tables
Acknowledgements
Section 1: Introduction
  1.1 Context
  1.2 Exposition Goals
  1.3 Motivation
  1.4 Report Chapters
Section 2: Background
  2.1 Other attempts
  2.2 A Conceptual Parser for Natural Language
  2.3 Conceptual Dependency and Montague Grammar: A step toward conciliation
  2.4 Schank/Riesbeck vs. Norman/Rumelhart: What's the difference?
  2.5 How a Neural Net Grows Symbols
  2.6 A hybrid Approach to Word Sense Disambiguation: Neural Clustering with Class Labelling
  2.7 A Generative Model for Semantic Role Labelling
  2.8 Unsupervised Semantic Role Labelling
Section 3: System Overview
  3.1 The Hercules AI Parser
  3.2 Input / Output
  3.3 Wordnet 1.17 Searches
  3.4 The Hercules Data Interface
  3.5 Sentence Structures and Base Node Hierarchies
    3.51 Sentence Structures Created from User Input
    3.52 The Frame Engine and Node Hierarchies
  3.6 Query Engine
    3.61 Frame Queries and Statistics
    3.62 Frame Queries with Formulas
  3.7 The AI Mind of Hercules
  3.8 Forming basic concepts
  3.9 Abstraction of Concepts
  3.10 Concepts, purpose, reason and goals
  3.11 Concepts forming reason of a living organism
  3.12 Learning through abstract concepts
  3.13 Underlying conceptual schemas and schema limitations
  3.14 Schemas based upon CD theory
  3.15 Critical Reasoning
  3.16 Hierarchy for reasoning
  3.17 Database Structures for reasoning
  3.18 Script Formula examples for Critical Reasoning
  3.19 Fact and Truth Corrections of the Databases
  3.20 Setting the Database Data
  3.21 Abstractions of real concepts for Analogy
  3.22 Limitations on Abstract Concepts
  3.23 Forming Analogies
  3.24 Corrections and limitations on Analogy
  3.25 Truth and Weight in analogy
Section 4: Experimentation, Results and Analysis
Section 5: Future Work
  5.1 Algorithm Design
  5.2 Hercules Parser Enhancements
  5.3 Reaching the goal of True Artificial Intelligence
Section 6: Concluding Remarks

List of Figures

Figure 3.1: The Hercules AI Parser components and subsystems
Figure 3.2: Relationships between the Hercules Data Interface (HDI) and the data collected from Wordnet
Figure 3.3: sWord class object, pointers, attributes and metadata relationships
Figure 3.4: Sentence node hierarchies tokenized using white spaces populated with resultant HDI search data
Figure 3.5: Frame sWord object node hierarchies with frame metadata including formula data structures
Figure 3.6: Query creation process and commands
Figure 3.7: Flow chart for concept wave / fragment section boundaries
Figure 3.8: sWord data structure updates and construction using the query stack execution processes for testing and setting bit defined data attributes
Figure 3.8: Objects, Attributes, Actions, Distance, Time, Position, Actor, and Witness
Figure 3.9: Witnessing events in conversation assist in experience, learning and expectation
Figure 3.10: Wordnet Hyponym hierarchies of the statement in figure 3.9
Figure 3.12: Hercules is able to link existing frames to create a new pattern based on user input and store for later reference
Figure 3.13: Hercules will add weight to patterns recognised in prior communications such as that of figure 3.12
Figure 3.14: Hercules fills a basic concept container for objects and actions by recognizing the subject matter of the discourse
Figure 3.15.1: The hyponym hierarchy of Socrates for premise A
Figure 3.15.2: The hyponym node hierarchy premise A joined by relationship to the hyponym node hierarchy premise B
Figure 3.15.3: The hyponym and node hierarchy of premise A and B
Figure 3.15.4: The hyponym and node hierarchy of premise A and B and C
Figure 3.16: Socrates Node Hierarchy of Wordnet data using relationships
Figure 3.17: Example script for using the Hercules ISA method for testing the node hierarchy of Socrates
Figure 3.18: Socrates node Hierarchies and Mortal definition can be traced through node relationships
Figure 3.19: Script, data and methods for finding what mortal means for Socrates, or what anything means for anything if given the context
Figure 3.20: Hercules simulates an interesting and engaging manner in communications with others
Figure 3.21: The subjects removed from a sentence create a frame
Figure 3.22: The hyponym hierarchies forming the abstracted concept with the definition metadata
Figure 3.24: Shows the (is a) node relationships created by Hercules between the table elements of table 3.4
Figure 3.25: Shows the overall categorised and ranged concept in a reduced and understandable way
Figure 3.26: Illustrates the relationships created by Hercules using hyponym data and critical reasoning
Figure 3.27: The distinction made to the concept category where the concept becomes too abstract
Figure 3.28: A frame and concept and abstraction within a given range
Figure 3.29: The hyponym data for Cleopatra
Figure 3.30: The hyponym data for Socrates
Figure 3.31: An analogy where first subject of comparable discourse is abstracted
Figure 3.32: Hyponym hierarchies provided by Wordnet for person, rock, and cat
Figure 3.33: Heuristic substitution and abstraction using the hyponym hierarchy of a particular word sense
Figure 3.34: Shows the comparison of hyponym hierarchies of Cleopatra, Socrates, and a Rock
Figure 3.35: Is the frame of the concept analogy of figure 3.31
Figure 3.36: Syntax structures of concepts and frames combined
Figure 3.37: Syntax for concepts and frames with category information included
Figure 3.38: Statement of fact provided by a person
Figure 3.39: Rock hyponym hierarchy with the object category of distinction
Figure 3.40: The expanded concept frame to a table or array of data
Figure 3.40: Illustrates the distinction drawn from the user input of figure 3.38 will deactivate categories of the analogy and concept
Figure 3.41: The upper and lower limits of analogy in context with relationships and attributes
Figure 3.42: The Analogy Upper Limit
Figure 3.43: The Analogy Lower Limit

List of Tables

Table 3.1: Concept score analysis Table
Table 3.2: The binary signature of a concept signature of Table 3.1
Table 3.3: The hyponym hierarchies for "Socrates is a man" using frame "* is a *"
Table 3.4: Shows the 2 x 7 Matrix of concept combinations of table 3.3

Acknowledgements

I would like to thank my supervisor, Dr. Tony C. Smith, for his help in guiding my studies and helping me to explain my project to others. Tony has inspired, encouraged, and challenged my views, while giving me the guidance I needed to explain my research. Tony's optimism and, need I say, at times devil's advocacy have made for an interesting philosophical journey into Artificial Intelligence research and design.

Section 1: Introduction

This report is a general exposition of the workings of the Hercules parser and the theory behind it. The Hercules parser has been under development for three years and is still being developed. Numerous features and functions have been integrated into the parser to provide a basis for a computer to learn from and communicate with people. This section briefly introduces the context, goals, and motivation of the report, and gives an overview of its chapters.

1.1 Context

Since the invention of the computer, the idea of Artificial Intelligence has been a source of constant fascination. The idea that a person can communicate with a computer, and have the computer understand and respond, has applications too numerous to describe. On the general view that a computer which understands people and assists them with their lives will be beneficial, I have created the Hercules parser to investigate how this may be achieved. This report sets out the steps, processes, and theory I investigated in building such a system.

1.2 Exposition Goals

The aim of this exposition is to describe the workings of the Hercules Parser and the theories underlying it, and to describe how the parser can be used with Wordnet and other databases, so that further research may be carried out using the Hercules Parser as the platform for reaching conclusive scientific findings. The goals are to:

- Describe the compositional structures of the parser
- Describe the data structures of the parser
- Describe the execution of script operations by the parser
- Describe the databases of the parser
- Describe the theory behind using Wordnet hyponym hierarchies in concept design
- Describe the theory behind using frames with a parser to create concept objects using Wordnet hyponym hierarchies
- Describe how critical reasoning can be used to supplement the node hierarchies of Wordnet
- Describe how analogy may be formed from concepts derived from the Wordnet hyponym hierarchies
- Describe in general how the Hercules parser can assist in forming a hypothesis about the meanings of conversation, where an algorithm can be applied to control program flow for interpreting communications
- Describe future work that can flow on from the exposition of the Hercules Parser

1.3 Motivation

The Hercules parser began as an attempt to leverage the information stored within Wordnet so that a computer may talk to a person. The aim was to allow a person to ask Hercules questions and have Hercules respond in an interesting way. Because of the large amount of information in Wordnet and the availability of its code in C++, Wordnet became the logical starting point for investigating Artificial Intelligence. C++ was the default language of the Wordnet code libraries, which made it attractive from the perspective of reducing processing and memory overhead, since raw power, flexibility, and direct hardware access may be required. Because of time constraints and the complexity of interfacing with .Net databases and libraries, managed class objects, XML, and Windows Forms have been integrated into what was previously a command-line application. The Wordnet 1.17 code has been altered significantly to incorporate the class objects of the Hercules parser. Further task-specific engines, based on class objects, have been designed to handle the core components and the functionality they provide. With the construction of the Hercules parser prototype nearly complete, what remains is to explore the relationships of data in communications, identify the patterns used for intelligent communication between individuals, and apply them to form an artificial intelligence within a computer for the benefit of assisting a person.

1.4 Report Chapters

Section 1: Provides an overview of the report.

Section 2: Provides a general background to the research documents that have contributed to the ideas and concepts on which the Hercules parser is based. Some of the material has been considered in the construction of the parser so that the theories or findings of those articles may be explored with a functional parser, with databases serving as a statistical repository.

Section 3: Discusses and expands on the goals listed in section 1.2. The goals are not set out individually, but are interrelated and addressed in the subsections under each topic.

Section 4: Discusses experimentation, results, and analysis. Because the Hercules parser has been designed as the platform on which the experiments will run, only limited experimentation has been carried out so far. Research will continue once the prototype has been completed.

Section 5: Identifies future work to be done in the areas of algorithms, parser enhancements, and the final goal of true Artificial Intelligence.

Section 6: Presents concluding remarks and observations on the Hercules Parser and the exposition within this report.

Section 2: Background

The systems and reference documents listed below have assisted in creating and supporting the underlying concepts that the Hercules Parser attempts to encompass.

2.1 Other attempts

There have been numerous earlier attempts at designing an artificially intelligent machine. They include the CYC Project by Douglas Lenat, A.L.I.C.E. by Dr. Richard S. Wallace, and Eliza by Joseph Weizenbaum. More recent attempts include Jabberwacky by Rollo Carpenter, which competed well for the Loebner Prize in an attempt to pass the Turing test.

2.2 A Conceptual Parser for Natural Language

"A conceptual parser for natural language" by Roger C. Schank and Lawrence G. Tesler describes an operable automatic parser for natural language. It is a conceptual parser, concerned with determining the underlying meaning of the input by using a network of concepts that explicates the beliefs inherent in a piece of discourse.

2.3 Conceptual Dependency and Montague Grammar: A step toward conciliation

"Conceptual Dependency and Montague Grammar: A step toward conciliation" by Mark A. Jones and David S. Warren contrasts and reconciles the CD theory of Schank's conceptual parser (section 2.2) with the logic system of Montague Grammar, using a sorted hierarchy and typed lambda calculus.

2.4 Schank/Riesbeck vs. Norman/Rumelhart: What's the difference?

"Schank/Riesbeck vs. Norman/Rumelhart: What's the difference?" explores the fundamental differences between two sentence parsers and how each handles keywords, frames, and expectations. The paper focuses more specifically on the operational level, but is thought-provoking where similarities are shared with the Hercules Parser.

2.5 How a Neural Net Grows Symbols

"How a neural net grows symbols" by James Franklin illustrates how clustering may be used in conjunction with a neural net for data reduction, a combination that is ideal for AI implementations.

2.6 A hybrid Approach to Word Sense Disambiguation: Neural Clustering with Class Labelling

"A hybrid approach to word sense disambiguation: Neural Clustering with class labelling" by Steve Legrand and JRG Pulido combines a neural algorithm with the Wordnet lexical database to semi-automatically label groups of items clustered in a multi-branched hierarchy, illustrating the use of neural algorithms together with ontological knowledge in word sense disambiguation tasks.

2.7 A Generative Model for Semantic Role Labelling

"A Generative Model for Semantic Role Labelling" by Cynthia Thompson, Roger Levy, and Christopher Manning uses the FrameNet semantic role and frame ontology to identify semantic roles. To quote from it, the paper attempts the task of learning to automatically assign such roles. Identifying such roles and the relationships between them can in turn serve as support for inference about a sentence's meaning, for antecedent resolution, or for other understanding or parsing tasks such as prepositional phrase attachment or word sense disambiguation. [...] FrameNet corpus and apply it to the task of automatic semantic role and frame identification. This paper develops a generative model from which one can infer role labels, given sentence constituents and a word from that sentence that is a predicator, which takes semantic role arguments.

2.8 Unsupervised Semantic Role Labelling

"Unsupervised Semantic Role Labelling" by Robert Swier and Suzanne Stevenson: to quote from it, they present an unsupervised method for labelling the arguments of verbs with their semantic roles using an algorithm which makes initial unambiguous role assignments, and then iteratively updates the probability model on which future assignments are based.


Section 3: System Overview

3.1 The Hercules AI Parser

The Hercules AI parser has been created to allow a person to converse with a computer. Figure 3.1 gives an overview of its components and subsystems. Hercules uses basic concepts, critical reasoning, and analogy to form a calculated hypothesis about what is being said. Hercules is pre-programmed with sufficient concepts and rules to allow the meanings of conversation to be explored. Hercules is also a goal-oriented parser, able to assist people with tasks that they wish to complete. The parser is divided into a number of components that assist in understanding communications and tasks.
[Figure 3.1 diagram. Components shown: WordNet 1.17 C++, Hercules Data Interface, Frame Engine, Query Engine, Hercules Parser, Input / Output, and the Query, Memory, Concept, Critical Reasoning, and Analogy databases.]

Figure 3.1: The Hercules AI Parser components and subsystems

3.2 Input / Output

Hercules receives input from a person and responds to the person in an intelligent way. The communications between Hercules and the person provide an experience that Hercules can learn from. Hercules uses Microsoft Windows Narrator to read its output aloud from a command prompt. Microsoft Windows Speech Recognition or a keyboard allows a user to provide text to Hercules via the command prompt.


3.3 Wordnet 1.17 Searches

Wordnet 1.17 provides information to Hercules regarding:

- Ontologies of hyponymy ("is a" relationships)
- Ontologies of meronymy ("has a" relationships)
- Word sense information, including the definitions of those senses
- Part of speech information
- Synonyms

The Wordnet 1.17 C++ program code has been modified to run multiple searches to provide the information Hercules requires. Hercules can be modified to use any of the Wordnet searches to retrieve information from the Wordnet databases. Normally Wordnet runs a search on a single word and returns specific search data depending on the search type. The Wordnet code libraries have been modified so that Hercules runs five searches per word instead of one. The information normally output to the user for each separate search is collected in a customised data structure called the Hercules Data Interface.

3.4 The Hercules Data Interface

Figure 3.2 illustrates the hierarchical relationships and flow of data between the Hercules Data Interface (HDI) and the data collected from Wordnet. The HDI is the container structure for all of the information retrieved from the Wordnet searches. When each search is run using the Wordnet libraries, custom modifications to the code populate the HDI with the Wordnet output data. Once the data has been collected for the words of the sentence as in section 3.3, it is attached directly to the words of the sentence as described in section 3.5.
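The container idea can be sketched as follows. This is a minimal illustration in Python, not the parser's C++ code: the class name `HerculesDataInterface`, the function names, and the toy dictionary standing in for the Wordnet databases are all assumptions made for the example. It shows only the shape of the approach, five searches run per word with every result gathered into one structure.

```python
from dataclasses import dataclass, field

# Toy stand-in for the Wordnet databases; the entries are illustrative only.
TOY_WORDNET = {
    "man": {
        "hyponym": ["man", "adult", "person", "organism", "entity"],
        "meronym": ["body", "head"],
        "senses": ["an adult male person"],
        "pos": ["noun", "verb"],
        "synonyms": ["adult male"],
    },
}

# The five kinds of information listed in section 3.3.
SEARCH_TYPES = ("hyponym", "meronym", "senses", "pos", "synonyms")


@dataclass
class HerculesDataInterface:
    """Hypothetical container for the output of every search on one word."""
    word: str
    hyponym: list = field(default_factory=list)
    meronym: list = field(default_factory=list)
    senses: list = field(default_factory=list)
    pos: list = field(default_factory=list)
    synonyms: list = field(default_factory=list)


def run_search(word, search_type):
    """Run one search; unknown words simply return no data."""
    return list(TOY_WORDNET.get(word.lower(), {}).get(search_type, []))


def collect(word):
    """Run all five searches on a word and gather the results in one HDI."""
    hdi = HerculesDataInterface(word)
    for search_type in SEARCH_TYPES:
        setattr(hdi, search_type, run_search(word, search_type))
    return hdi
```

Calling `collect("man")` returns a single object holding all five result sets, which is the property the HDI relies on when the data is later attached to the words of a sentence.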

[Figure 3.2 diagram: Wordnet feeds the Hercules Data Interface, which holds the Hyponym Tree, Meronym Tree, Word Senses, Part of Speech, and Synonyms.]

Figure 3.2: Relationships between the Hercules Data Interface (HDI) and the data collected from Wordnet

3.5 Sentence Structures and Base Node Hierarchies

The default container objects for the parsing functions of Hercules are sWord class objects. The sWord class object allows a number of linked lists to be formed into node hierarchies. Figure 3.3 illustrates the relationships of the metadata to the sWord node. Instead of having multiple objects of differing types, extensions to the class attributes are added as pointers to other data structures, which then define the types. The presence of a particular pointer determines the parsing function that may be used. Parsing functions or methods are based upon set theory and predicate logic; the resulting formulas use attributes to identify the supersets and subsets, and logical assertions. Node objects can then be parsed according to a given formula, where mathematical symbols are mapped to the processes to be carried out on the data, including the relationships between data. As the sWord data structure can be used in many ways, only a general overview is provided here. Figure 3.3 shows the node pointer types that can be used to order the hierarchies:

- sWord-Data: sWord data resulting from searches, and metadata
- sWord-Next: Link to next sWord node at same level
- sWord-Next-List: Link to next sWord list at same level
- sWord-Phrase: Link to next sWord Phrase-list at same level
- sWord-Next-Phrase-List: Link to next sWord list at the next node level
- Frame-Meta: Categories, Weights, Thresholds, Formulas
- Wordnet-Meta: Meronym Tree, Hyponym Tree, Definitions, Synonyms, Senses
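The pointer-driven design can be illustrated with a short sketch. It is written in Python rather than the C++ of the parser, it reduces the pointer set to two links, and the attribute and method names are hypothetical. The point it shows is that a single node class gains abilities from whichever optional metadata is attached, rather than from a separate subclass per node type.

```python
class SWord:
    """One node of the hierarchy; optional pointers decide what it can do."""

    def __init__(self, text):
        self.text = text          # the word or phrase held by this node
        self.next = None          # next sWord at the same level
        self.phrase = None        # first sWord of the list one level down
        self.frame_meta = None    # present only on frame nodes
        self.wordnet_meta = None  # present only once search data is attached

    def available_functions(self):
        """The presence of a pointer, not a node type, enables a function."""
        functions = []
        if self.wordnet_meta is not None:
            functions.append("wordnet")
        if self.frame_meta is not None:
            functions.append("frame")
        return functions
```

A freshly created node supports no parsing functions; attaching Wordnet metadata or frame metadata to it enables the corresponding functions without changing the node's class.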

[Figure 3.3 diagram: an sWord node (Word-Use, Word-Options, Tenses, Hyponym, Meronym, Senses, POS, Synonyms) with links to its Frame Metadata and Wordnet Metadata.]

Figure 3.3: sWord class object, pointers, attributes and metadata relationships

3.51 Sentence Structures Created from User Input

Figure 3.4 illustrates the sWord node and data hierarchies created when Hercules receives text input from the user via a command prompt. The same hierarchy is used for any text or phrase that Hercules parses, including text loaded from databases.

1. Input is received from the user.
2. A first sWord class object node is created, containing the full text of the phrase. The words of the sentence in its char buffer are separated by white space, as in natural language.
3. Hercules separates each word in the first node's char buffer into the char buffer of its own sWord class object, using the white space as the delimiter.
4. The first word of the new linked list of separated words is linked to the first node.
5. A search is run for each word in the list in turn, using section 3.3 to provide the data of section 3.4.
6. After each search of section 3.3, the resulting data of section 3.4 is attached to the corresponding word of the linked list.
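The steps above can be sketched as follows. This is an illustrative Python version, not the parser's own code: the `SWord` class is a cut-down stand-in, `lookup` is a hypothetical placeholder for the Wordnet searches, and the search is folded into the same pass as the tokenising for brevity, whereas the steps describe it as a separate pass over the finished list.

```python
class SWord:
    """Cut-down node: text, a same-level link, a child link, search data."""

    def __init__(self, text):
        self.text = text
        self.next = None
        self.phrase = None
        self.wordnet_meta = None


def lookup(word):
    """Hypothetical stand-in for the searches of section 3.3."""
    return {"word": word.lower(), "senses": [], "pos": []}


def build_sentence(text, search=lookup):
    """Build the sentence node and its linked list of word nodes."""
    root = SWord(text)                     # step 2: node with the full text
    previous = None
    for token in text.split():             # step 3: split on white space
        node = SWord(token)
        node.wordnet_meta = search(token)  # steps 5 and 6: search, attach
        if previous is None:
            root.phrase = node             # step 4: link first word to root
        else:
            previous.next = node
        previous = node
    return root


def words(root):
    """Walk the word list that hangs off the sentence node."""
    node = root.phrase
    while node is not None:
        yield node.text
        node = node.next
```

For the input "Socrates is a man", `build_sentence` yields one root node holding the whole sentence and four linked word nodes, each carrying its own search data.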

[Figure 3.4 diagram: the sentence node "Socrates is a man" linked to one sWord node per word, each carrying Hyponym, Meronym, Senses, POS, and Synonyms data.]

Figure 3.4: Sentence node hierarchies tokenized using white spaces populated with resultant HDI search data

3.52 The Frame Engine and Node Hierarchies

Hercules includes a frame engine, which loads frames of text into the computer's memory. Figure 3.5 illustrates the partial node hierarchies and metadata structures that result from loading the frame tables of the database. The frame engine loads the data from a database; the frames of text and any associated Frame-Metadata for each frame are then available to compare against sentence information.

1. Hercules checks the Word-Sets table of the main database to find out which tables are to be treated as frame tables.
2. A node hierarchy of sWords is created, first by table name and second by the full text of each frame as a phrase.
3. The full text frames are then separated into linked lists of sWords as in section 3.51, except that they optionally carry the Frame-Metadata instead of the Wordnet-Metadata.
4. The Frame-Metadata remains attached to a higher-level sWord node, together with the formula to be run if conditions dictate.
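How a loaded frame and its formula might be applied to a sentence can be sketched as below, using the ISA frame "All * are *" and the formula "SETISA 2 is 4" that appear in figure 3.5. The interpretation is an assumption made for illustration: "*" is taken to stand for exactly one word, and the numbers in the formula are taken to be 1-based word positions. The function names and the dictionary used as an ISA table are hypothetical, and the code is Python rather than the parser's C++.

```python
def match_frame(frame, sentence):
    """Return the sentence words if they fit the frame, else None.

    Assumption: "*" matches exactly one word; every other frame token
    must match the word at the same position, ignoring case.
    """
    frame_tokens = frame.split()
    sentence_words = sentence.split()
    if len(frame_tokens) != len(sentence_words):
        return None
    for token, word in zip(frame_tokens, sentence_words):
        if token != "*" and token.lower() != word.lower():
            return None
    return sentence_words


def run_setisa(formula, sentence_words, isa_table):
    """Apply a formula of the form 'SETISA I is J' (1-based positions)."""
    name, child_pos, _, parent_pos = formula.split()
    if name != "SETISA":
        raise ValueError("unsupported formula: " + formula)
    child = sentence_words[int(child_pos) - 1].lower()
    parent = sentence_words[int(parent_pos) - 1].lower()
    isa_table.setdefault(child, set()).add(parent)


def isa(child, parent, isa_table):
    """Follow the recorded ISA links transitively from child to parent."""
    seen, pending = set(), [child]
    while pending:
        current = pending.pop()
        if current == parent:
            return True
        if current not in seen:
            seen.add(current)
            pending.extend(isa_table.get(current, ()))
    return False
```

Matching "All men are mortal" against the frame and running the formula records that "men" is a "mortal"; once a link from "socrates" to "men" is also recorded, `isa("socrates", "mortal", table)` follows both links.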


[Figure 3.5 diagram. Hercules databases: HASA, ISA, Ability, Hercules-Memory, Concepts, Analogy, Critical-Reasoning, Word-Sets, Queries... The Word-Sets table is checked to find the frame tables (HASA, ISA, Ability). Example frame in the ISA table: "All * are *", with Frame Metadata Reference: 0, Category: 20, Order: 0, Weight: 0, Formula: SETISA 2 is 4. Example frame in the HASA table: "Has a * a *".]

Figure 3.5: Frame sWord object node hierarchies with frame metadata including formula data structures

3.6 Query Engine

Hercules includes a query engine, which processes text-based scripts into a chain of query objects. The query objects allow Hercules to test conditions and carry out operations against the sentence data in a particular sequence. The query returns a success if its conditions are met. Figure 3.6 illustrates the query creation process, in which the query information is read from the Pattern-Query database table.

1. The query is read from the Pattern-Query table of the database.
2. The scripted operators are converted into a bit-category that signifies the operations to be carried out on particular data.
3. The scripted conditions are set in the query class objects to signify the particular data to be tested for.
4. Each section of the query creates a query object, which is stacked for execution of the operations against the conditions being tested for.

[Figure 3.6 diagram. Hercules databases: HASA, ISA, Ability, Hercules-Memory, Concepts, Analogy, Critical-Reasoning, Word-Sets, Pattern-Query, ...
Database table Pattern-Query: wordcount 2, word 1 is solid, word 1 is instance, word 2 is solid, word 2 maybe noun, Set word 2 guess, Set word 2 noun. Metadata: Rank: 0, Weight: 0, Threshold: 0, Link: 0, Category: 20, Active: 0.
Check for Operators. Standard query operator actions: Is, Like, Break, Not, Starts, Go-to, Maybe, Ends, Not-Maybe, Contains, Link, Last, Set, If, Finished, And, Then, ID, Or. Frame query operator actions: Frame-Item, Formula, IsA, HasA, InA.
Set Conditions. Query conditions: Noun, Verb, Adjective, Adverb, Instance, Guess, Solid, Frame, Present Tense.
Create Query Stack: Word 1 Is: Solid; Word 2 Is: Instance; Word 2 Maybe: Noun; Word 2 Set: Guess; Word 2 Set: Noun.]

Figure 3.6: Query creation process and commands
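The creation process of figure 3.6 can be sketched as a small script parser. This is an illustrative Python version under stated assumptions: the clause grammar is inferred from the single example script in the figure, only three of the listed operators are handled, and operators are kept as plain strings rather than converted to the bit-categories the parser uses. The `Query` class and `parse_script` function are hypothetical names.

```python
from dataclasses import dataclass

# A subset of the standard query operators listed in figure 3.6.
OPERATORS = {"is", "maybe", "set"}


@dataclass(frozen=True)
class Query:
    operator: str   # a test ("is", "maybe") or an assignment ("set")
    word: int       # 1-based position of the word in the sentence
    condition: str  # the attribute tested or set, for example "noun"


def parse_script(script):
    """Turn a comma-separated script into a word count and a query list."""
    word_count, stack = None, []
    for clause in script.split(","):
        tokens = clause.lower().split()
        if not tokens:
            continue
        if tokens[0] == "wordcount":       # wordcount N
            word_count = int(tokens[1])
        elif tokens[0] == "word":          # word N OPERATOR CONDITION
            stack.append(Query(tokens[2], int(tokens[1]), tokens[3]))
        elif tokens[0] == "set":           # set word N CONDITION
            stack.append(Query("set", int(tokens[2]), tokens[3]))
        else:
            raise ValueError("unrecognised clause: " + clause.strip())
    for query in stack:
        if query.operator not in OPERATORS:
            raise ValueError("unsupported operator: " + query.operator)
    return word_count, stack
```

Parsing the Pattern-Query script from figure 3.6 with this sketch gives a word count of 2 and six query objects, one per clause, in the order in which they would be executed.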


Once the query has been created and added to the stack, it can be executed against the sentence. A success is returned if the whole query executes, and Hercules then continues executing the remaining queries in the stack. Standard logical, set, or mathematical calculations are performed as processes of the query.

Because Hercules can perform many kinds of processes on almost any type of information, pattern recognition using statistics needs to be investigated in order to determine how best to use the English language to communicate. Hercules is able to use the patterns existing in the Wordnet domain categories as the basis for concept recognition in text. The domain categories can assist in identifying a concept, while the sense of a word or words provides the definitive value of any expression. This is because a word used by a person in a given sense usually has a determinate meaning, even if that meaning is subjective between individuals. Hence the actual meaning of "Enjoy" is provided once the correct sense is discovered. To discover the correct meaning, queries and statistics are used to discover concepts and the sense of a word.

3.61 Frame Queries and Statistics

Where a frame indicating the possibility of a concept is discovered in a sentence, a statistic can be generated to assist with understanding the context of the subjects, and therefore the sense of the words. Because frames have already been categorised in Hercules under a particular concept, they can help identify the probability of the word senses and subjects of the sentence. The rows of table 3.1 below show that Personal, Tense, Movement, Ability, and Accomplish are some of the concept categories available in Hercules. Concept categories provide a very rough basis for understanding a sentence using statistics.

The discourse "Socrates found and ingested an antidote to save his life" produces table 3.1 when five concept categories, using non-specific discourse concept identifiers, are run against the whole sentence. Later, with further development in pattern recognition, surrounding sentences can also help to identify the context of the words. For the purposes of illustration, however, a smaller context is explored in table 3.1.

Socrates found and ingested an antidote to save his life

Category    Frame 1       #   Frame 2     #   Frame 3    #   Frame 4      #
Personal    Socrates      1   Cleopatra   0   his        1   her          0
Tense       * was *       0   * is *      0   * will *   0   *ed          1
Movement    * ingest*     1   * to *      1   * *ed *    1   * went *     0
Ability     * *ed to *    1   * can *     0   to *       1   * ate *      0
Accomplish  * made *      0   * did *     0   * can      0   found * to   1

Table 3.1: Concept score analysis Table

Table 3.1 shows that each time a match is discovered under a particular concept category indicator, a score is generated. The resulting counts are Personal: 2, Tense: 1, Movement: 3, Ability: 2, and Accomplish: 1. This statistic, the frequency with which a concept is possibly present, can be represented in a flow chart. The boundaries of the concept are then established in figure 3.7 as a section of a wave.
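The scoring step can be written out as a short worked example. The match values below are copied from table 3.1; the function names are hypothetical, the code is Python rather than the parser's C++, and the frame matching itself is not reproduced, only the counting that follows it.

```python
# Match values copied from table 3.1: one entry per frame in each category.
HITS = {
    "Personal":   [1, 0, 1, 0],
    "Tense":      [0, 0, 0, 1],
    "Movement":   [1, 1, 1, 0],
    "Ability":    [1, 0, 1, 0],
    "Accomplish": [0, 0, 0, 1],
}


def concept_scores(hits):
    """Score each category by the number of frame matches found in it."""
    return {category: sum(row) for category, row in hits.items()}


def strongest(scores):
    """Return the categories that share the highest score."""
    top = max(scores.values())
    return [category for category, score in scores.items() if score == top]


scores = concept_scores(HITS)
```

Here `scores` holds the five counts quoted in the text, and `strongest(scores)` picks out Movement, the peak of the wave plotted in figure 3.7.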


[Figure 3.7 chart: Concept Score (0 to 5) plotted against Concept Category (Personal, Tense, Movement, Ability, Accomplish).]

Figure 3.7: Flow chart for concept wave / fragment section boundaries

Because figure 3.7 provides the boundaries of the possible concepts of a particular category, it is possible to compare concepts with each other, or to measure a concept based on probability where subjects are used in successful communication. Successful communication is established later by communicating learned information back to a person and then establishing relationships between the data. The boundaries, or wave, of a concept can represent a fragment of a concept, where categories are included in or excluded from the query run against the discourse.

Table 3.1 also allows a frame signature to be established. The frame signature of table 3.1 is not the score, but the binary representation of the presence of the frames found in a particular category. There may be many identifiers within a section of discourse that indicate a concept; the presence of an identifier in a category allows a binary concept signature to be formed and used in conjunction with the concept wave or concept fragment boundaries of figure 3.7. The scores of table 3.1 may also weight a signature, where signatures appear to be the same in binary representation but differ in score, and therefore in weight. This extra score allows a pattern to further distinguish concepts, so that the overall concept can be appropriately weighted and distinguished during pattern recognition in related discourse.

Personal    1 0 1 0
Tense       0 0 0 1
Movement    1 1 1 0
Ability     1 0 1 0
Accomplish  0 0 0 1

Table 3.2: The binary signature of a concept signature of Table 3.1

Table 3.2 illustrates that for each concept indicator of table 3.1 that is present, a bit is set to 1 for that category. If an indicator of a particular concept category is not present, its cell in the table is set to 0. Where many concepts exist within discourse and are tested for using a query, the signatures may be stacked atop each other and re-ordered by category, score, and presence, using an algorithm for sorting the categories. The algorithm is discussed in section 5 as future work. It would also be interesting to use a neural network to identify the patterns present in communications where a score and a binary signature can be identified.
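The difference between a signature and a score can be sketched as follows. This Python example assumes that a frame may match more than once, so that two concepts can share the same presence bits while differing in score. The `rank` function is only a placeholder ordering invented for the example; the sorting algorithm itself is left to future work in the text, and all names here are hypothetical.

```python
def signature(counts):
    """Presence bits: 1 where a frame matched at least once, else 0."""
    return [1 if count > 0 else 0 for count in counts]


def pack(bits):
    """Pack a list of bits into one integer, first bit most significant."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


def rank(concepts):
    """Order concept names by signature value, then score, highest first."""
    return sorted(
        concepts,
        key=lambda name: (pack(signature(concepts[name])), sum(concepts[name])),
        reverse=True,
    )
```

For instance, match counts of `[1, 0, 1, 0]` and `[2, 0, 1, 0]` share the signature 1010 but score 2 and 3 respectively, so the score is what separates them when the signatures tie.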

Where repeated patterns are identified in communications, and the senses of the words making up the pattern are discovered, a Hidden Markov model can be used to identify the concept category, semantic role, or other delineable class of word, type, or category. Because repeated patterns indicate a probability, those patterns must be tabled, theoretically into a Bayesian network, where the probability can be deduced from the statistical relationships of the words in the discourse. Hercules provides a platform for flexible algorithms, identifiable patterns, tables of probabilities of expected relationships, concept categories, weighting scores, and concept fragments and signatures. In theory these can help Hercules to identify, in this example, that a person is moving to perform an ability, which in turn helps to determine the senses of the subjects of the discourse. To provide such a flexible platform for exploring the meanings of communications, Hercules uses formulas and procedures that can be executed when a particular frame is identified within the discourse.

3.62 Frame Queries with Formulas

Where more complex data operations are required, a frame allows a formula to be executed. The formula allows the sWord node hierarchies to be traversed, and query operations to be carried out on the nodes returned. Formulas can be constructed to test any attribute or node within any database or within the memory of Hercules. It is logical to use well-known and established formula notations, such as those found in set theory and predicate logic, because mathematicians and linguists are familiar with the symbols and what they represent. Executing a procedure of Hercules by parsing a script that follows a common notation for grouping data simplifies the creation, explanation, and implementation of established formulas. Other common formulas, which are actually processes, have been created to access specific data.
The current processes for accessing the nodes reside in the formula section of the metadata. Formulas and processes are closely related in Hercules because in reality the processes return or manipulate a subset of data in the node hierarchy. Formulas can also be attached to a frame so that the correct algorithms can balance the weight of the data where a frame is matched. This means that where a statistic is set at a particular level for a context, that statistic is demoted or promoted in weight based upon that formula; given a different context, the same statistic is treated differently. The formulas for returning a member of a subset of nodes in Hercules are ISA, HASA, INA and MEANS. The formulas will also be extended to include running SQL commands to retrieve and manipulate the nodes and associated metadata.

Figure 3.8 illustrates the processes carried out by Hercules when the discourse "All men" is recognised using a query.

1. A user provides the words "All men".
2. The sentence list is created as in section 3.51.
3. Wordnet is searched as in section 3.3.
4. The HDI is updated as in section 3.4.
5. The HDI data is linked to the sentence list as indicated in sections 3.4 and 3.51 (a relationship created by assigning the HDI sWord pointers to the sentence sWord Wordnet metadata structures as described in section 3.5), and #Defined bit flag information is available for the Word-Use-Options for the Part of Speech, setting bits 3 and 4 to indicate an adjective and adverb respectively.
6. Determiners and other information, such as tenses, are identified within the discourse so that a determinate use of a word may be attributed to a word of the discourse (e.g. in "All men", "all" is the determiner and is set as an Instance Object, indicated by running a frame query for instance objects using Instance frames as in section 3.52, and setting on bits 5 and 10 in the Word-Use).
7. The query stack is then executed as described in section 3.6 (this example uses the query example of section 3.6 to illustrate how the operations and conditions are executed and tested respectively). The query roughly translates in lay terms to "if word 1 is a determiner or instance and the next word has the option of being a noun, then set the next word after the determiner to a noun".
8. As the query stack is executed, a linked list of query objects is executed and tested against the discourse provided by the user. Because the discourse has been populated with data from Wordnet and other databases, the query allows the data to be tested depending on which operations and data members have been specified within the query objects during their construction at runtime. An appendix can be provided in future work explaining the defined operations and the data members operated on, including how and why Hercules uses them.
9. In this example Hercules matches the bit defined data within the sWord data structures to test and set the conditions of other data members according to the rule put forward in the script, and returns a success for the chain of query objects of a specific query in the stack if all conditions are tested successfully.


Input Sentence: All men

sWord 1: All. Hyponym, Meronym, Senses, POS, Synonyms. Word-Use: Solid(5), Instance(10). Use Options: Adj(3), Adv(4)
sWord 2: Men. Hyponym, Meronym, Senses, POS, Synonyms. Word-Use: None(0). Use Options: Noun(1)

Process: Create Sentence List -> Search Wordnet -> Update HDI -> Link HDI sWord Data to Sentence -> ID determiners from Frames (e.g. All * = Word-Use(5, 10)) -> Execute Query Stack -> Return Success or Fail -> Execute next Query in the Stack

Query stack operations:
sWord 1 IS Solid(5): All:1 U:5,10 O:3,4; 1:U & bit(5); 1000010000 & 0000010000 = 0000010000
sWord 1 IS Instance(10): All:1 U:5,10 O:3,4; 1:U & bit(10); 1000010000 & 1000000000 = 1000000000
sWord 2 MAYBE Noun(1): Men:2 U:0 O:1; 1:0 & bit(1); 0000000001 & 0000000001 = 0000000001
sWord 2 SET Guess(9): Men:2 U:0 O:1; 2:U |= bit(9); 0000000000 | 0100000000 = 0100000000
sWord 2 SET Noun(1): Men:2 U:0 + bit(9) O:1; 2:U |= bit(1); 0100000000 | 0000000001 = 0100000001

Figure 3.8: sWord data structure updates and construction using the query stack execution processes for testing and setting bit defined data attributes
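The bit tests and sets of Figure 3.8 can be sketched in Python as below. This is only an illustration under stated assumptions: the `SWord` class, the `run_query` function and the constant names are hypothetical, bits are counted from 1 at the least significant end to match the 10-digit patterns in the figure, and the query is treated as succeeding only when every test passes.

```python
def bit(n):
    """Mask for bit n, counting from 1 at the least significant end."""
    return 1 << (n - 1)

# Bit positions taken from Figure 3.8.
SOLID, INSTANCE, GUESS, NOUN, ADJ, ADV = 5, 10, 9, 1, 3, 4

class SWord:
    """Hypothetical stand-in for the sWord structure (flag fields only)."""
    def __init__(self, text, use=0, options=0):
        self.text = text
        self.use = use          # Word-Use bits (U)
        self.options = options  # Use Options bits (O)

def run_query(first, second):
    """Test word 1 for Solid and Instance, test word 2 for the Noun option,
    then set Guess and Noun on word 2. Returns True on success."""
    if not first.use & bit(SOLID):
        return False
    if not first.use & bit(INSTANCE):
        return False
    if not second.options & bit(NOUN):
        return False
    second.use |= bit(GUESS)
    second.use |= bit(NOUN)
    return True

all_word = SWord("All", use=bit(SOLID) | bit(INSTANCE), options=bit(ADJ) | bit(ADV))
men_word = SWord("Men", options=bit(NOUN))
ok = run_query(all_word, men_word)
result_bits = format(men_word.use, "010b")
```

After the query runs, `result_bits` is `0100000001`, the same final pattern as the last operation in the figure.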


3.7 The AI Mind of Hercules

As Hercules is being designed to be the platform of an artificial mind, there must be some formation of basic concepts in order to understand and respond to a person intelligently. It is anticipated that Hercules will utilize Neural Network style learning for pattern recognition, using techniques such as clustering as described by James Franklin in "How a neural net grows symbols", to assist in recognising and processing large data volumes. However, I would theorise that concept fragments and their troughs and peaks will assist in identifying and distinguishing a concept in conjunction with a bit-mask filter, instead of a symbol; or that waves and symbols could be used instead of just symbols themselves, so that the neural net can be understood at a schema level. Also, "A Hybrid Approach to Word Sense Disambiguation: Neural Clustering with Class Labelling" illustrates a Self Organized Map, which may be used with concept category re-organization, using the concept signatures and pattern techniques of section 3.61 with clustering, to allow discerned categories to assist in word sense disambiguation by reorganising categories of stacked signatures to identify real patterns of concepts.

3.8 Forming basic concepts

Forming basic concepts allows communications to be understood. Hercules has some basic concepts pre-programmed so that a hypothesis can be formed about what is being said, even if the hypothesis is incorrect. The basic concepts are formed around the theoretical maxim "for every action there is an equal and opposite reaction". This requires a subjective view of metaphysics, and a consideration of the reaction within a person's mind when witnessing an event. Beginning to form concepts in Hercules requires a binary view of the physical world. For example: Object X has Attribute Y. Object X at position A moved to position B.

[Figure: diagram with the labels Object X, Y, A, D, T, Y, B and Object Z]

Figure 3.8: Objects, Attributes, Actions, Distance, Time, Position, Actor, and Witness

We are able to heuristically recognise these subtleties in our environment. From the example of objects, actions, attributes, time and position, we are able to determine the core concepts of objects (X), actions (D/T), locations (A or B), distances (D = B-A), and times (T). Metaphysical concepts are established to build from and form a simple schema for the node hierarchies. As more becomes known about Object Z, it is attributed to Object Z; for example, where Object Z is called Hercules and Attribute Y indicates Hercules is a computer, and so on for any additional attribute. The node hierarchies are thus similar to Wordnet's categories of ISA and HASA, and we can understand and externally build upon Wordnet's databases to include Object Z ISA computer, Object Z HASA Attribute Y, Object Z's Attribute Y ISA name, Object Z's name IS Hercules.

Actor and witness form a binary view of observations in the real world. For example, Actor Object Z with attribute Y witnessed Actor Object X with attribute Y move from position A to position B. The binary perspective can be applied to real world situations. A Hercules object Z witnessed a computer object X move: Hercules witnessed the computer move to a new subnet of the network.
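The relationships listed above for Object Z could be held as simple triples. The sketch below is an assumption about layout only; the names `relationships`, `add_relationship` and `lookup`, and the dictionary used for the witnessed event, are hypothetical and are not the Hercules database schema.

```python
# Hypothetical store of (subject, relation, value) triples.
relationships = []

def add_relationship(subject, relation, value):
    relationships.append((subject, relation, value))

def lookup(subject, relation):
    """Return every value recorded for a subject under a relation."""
    return [v for s, r, v in relationships if s == subject and r == relation]

# The four relationships given in the text for Object Z.
add_relationship("Object Z", "ISA", "computer")
add_relationship("Object Z", "HASA", "Attribute Y")
add_relationship("Object Z's Attribute Y", "ISA", "name")
add_relationship("Object Z's name", "IS", "Hercules")

# Actor and witness: Object Z witnessed Object X move from A to B.
observation = {"witness": "Object Z", "actor": "Object X",
               "action": "move", "from": "A", "to": "B"}
```

A call such as `lookup("Object Z", "ISA")` then returns the categories recorded for Object Z.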

[Figure: Sally (Y) says "I like eating Chicken (A), Salmon (B), and Turkey (C)." to Hercules (Y)]

Figure 3.9: Witnessing events in conversation assists in experience, learning and expectation

Actor and witness also allow a binary perspective to distinguish communications. Actor Object Z witnessed Actor Object X communicate A, B and C: Hercules witnessed Sally say "I like eating Chicken, Salmon, and Turkey". Concepts are derived from an analogous abstraction of a sentence. Consider dissecting the statement above. We can make many assumptions about the statement, and the assumptions we make are based upon what we expect or have experienced. People innately expect what they have experienced. A person's mind will draw a conclusion about the statement simply by reading or witnessing it. This can be applied to the learning of Hercules where patterns are recognised within discourse. In witnessing the statement above, concepts are in fact required to supplement an understanding or hypothesis about what is being said, and to correctly identify the word senses in the statement made by the other party. Pattern recognition can occur by witnessing a statement, then making a generalization about the structure of the sentence. Where generalizations are made, such as about the semantic role of a word's sense or about the domain or concept category, the pattern can then be used to predict that, where the repeated sentence structures are recognized, similar concepts underpin the subjects. To truly recognize the concepts underpinning a sentence would require some experience. The initial experience of the computer is pre-coded to a basic level; so far it has been my own experience of what may indicate a concept within a sentence, represented using a familiar frame of English as the reference that achieves this initial recognition. Fundamental frames of concepts have been pre-written for Hercules and are used to explore the meanings of communications as described in this exposition.

These fundamental and core concepts allow Hercules to explore the meanings of subjects in a logical and analogous way, using the logical relationships established in Wordnet and by collecting information through communication back to the user.


3.9 Abstraction of Concepts

Considering Sally's statement again from figure 3.9, we can determine concepts from the subjects. Wordnet Hyponym hierarchies can be used to create abstract concepts. A simple approach can be taken with the discourse: starting with "I" (Sally), then "I like", then "I like eating", then "I like eating A, B, and C". The concepts and the subjects are completely related. An abstraction of the concept information can be made, presuming the senses are correctly identified for the statement, and is shown in figure 3.10, which displays the Wordnet Hyponym hierarchies for the words of the statement in figure 3.9.

=0> I
=1> not you
=2> person, individual, someone, somebody, mortal, soul
=3> organism, being

=0> like
=1> see, consider, reckon, view, regard
=2> think, believe, consider, conceive
=3> evaluate, pass judgment, judge
=4> think, cogitate, cerebrate

=0> eat
=1> eat
=2> consume, ingest, take in, take, have
=1> consume, ingest, take in, take, have

=0> [A: Chicken, B: Salmon, C: Turkey]
=(1-5)>
=6> animal, animate being, beast, brute, creature, fauna
=7> organism, being
=8> living thing, animate thing
=9> object, physical object
=10> physical entity
=11> entity

Figure 3.10: Wordnet Hyponym hierarchies of the statement in figure 3.9

Taking nodes of figure 3.10 at a lower level than the initial nodes of the sentence in figure 3.9, we can choose a pattern which may or may not be useful. For the purposes of this example, a concept can be abstracted to lower nodes and placed in a sequence for a database to store the abstract concept as:

person:2 [sally] evaluate:3 [enjoys] ingest:2 [eating] animal(s):6 [A, B and C]

The abstract concept is formed by Hercules from the Hyponym hierarchies and can then be stored in a concept database. The concept may or may not include ranges of lower nodes, e.g. [Person:1-2] [evaluate:1-4] [ingest:1-2] [animals:6-11], and can then be limited later on when forming analogies. The concept may range over any of the category domains, from the highest nodes down to the bottom of the hierarchies. The syntax and order of the words form the frame for the new concept, and the frame can then be used to compare it with other statements. The comparison may be done using the techniques of section 3.61, and can be made against statements having similar Hyponym hierarchies. Concept Frames can be derived from any statement, though an understanding of the purpose of the statement is determined later through experience. Figure 3.11 shows where:

1. Hercules witnesses the statement made by Sally.
2. The frames are checked using a query as in section 3.62. The query may or may not check any or all of the frames Hercules has in memory; in this example it has "I like *" and "* and *" (and others), and links those frames to create a larger frame.
3. The larger frame combination is then stored in the frame database with other metadata, such as the subjects and other node metadata, for later use in recognizing speech patterns and expected subjects.
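The abstraction step described earlier in this section, where each word of the statement is replaced by a node taken from a chosen level of its hyponym hierarchy, can be sketched as follows. This is a simplified illustration: the `HIERARCHIES` table keeps one label per level copied from Figure 3.10, and the function name and output format are assumptions rather than the stored format used by Hercules.

```python
# One label per level, abbreviated from Figure 3.10 (level -> node label).
HIERARCHIES = {
    "I": {0: "I", 1: "not you", 2: "person", 3: "organism"},
    "like": {0: "like", 1: "see", 2: "think", 3: "evaluate", 4: "think"},
    "eating": {0: "eat", 1: "eat", 2: "ingest"},
    "A, B and C": {6: "animal", 7: "organism", 8: "living thing"},
}

def abstract_concept(words_and_levels):
    """Replace each word by the node at the requested hierarchy level."""
    return [
        f"{HIERARCHIES[word][level]}:{level} [{word}]"
        for word, level in words_and_levels
    ]

concept = abstract_concept(
    [("I", 2), ("like", 3), ("eating", 2), ("A, B and C", 6)]
)
```

Choosing a different level for any word yields a broader or narrower abstraction of the same frame.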

[Figure: Sally (Y) says "I like eating Chicken (A), Salmon (B), and Turkey (C)." to Hercules (Y). Hercules checks the frames, finds "I like" and "* and *", creates the new pattern "I like * * and *", and stores it in the database.]

Figure 3.12: Hercules is able to link existing frames to create a new pattern based on user input and store it for later reference

Patterns can be recognised after experiencing communications, where repeated patterns in communications point to valid statements. Valid statements can then be used to communicate back to a person or to identify correct speech in communications.
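A rough sketch of the frame check and linking step is given below. It assumes that frames behave like shell-style wildcard patterns, which is a substitution made for illustration (Python's `fnmatch` module), and the names `frame_database`, `matching_frames` and `link_frames` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical frames already held in memory.
frame_database = ["I like *", "* and *", "Is * *"]

def matching_frames(sentence):
    """Return the stored frames that the sentence fits."""
    return [frame for frame in frame_database if fnmatchcase(sentence, frame)]

def link_frames(frames):
    """Join the matched frames into one larger frame."""
    return " ".join(frames)

sentence = "I like eating Chicken, Salmon, and Turkey"
matched = matching_frames(sentence)
new_frame = link_frames(matched)

# Store the linked frame for later recognition if it is new.
if matched and new_frame not in frame_database:
    frame_database.append(new_frame)
```

With the sentence from the figure, the two matching frames combine into `I like * * and *`, which itself matches the original sentence.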


3.10 Concepts, purpose, reason and goals

Any concept requires a purpose, because without an understanding of purpose the concept is meaningless. There is no use for a meaningless concept; therefore purpose provides meaning. A meaningful explanation of an event for Hercules and others requires basic reasons to supplement the core-concepts of Hercules. The most basic of concepts are those for understanding the needs of a living organism, allowing a purpose to be speculated by Hercules. Even if the purpose is misinterpreted, there is opportunity for correction later via further communications, and it may also be that many purposes are fulfilled by one action or communication. The person being communicated with should provide a correction or aberration in their communication if perturbed or perplexed by a miscommunication from Hercules. If no correction is provided, the communication and concepts appear valid but may be challenged later. Purpose and meaning also require that Hercules apply itself to assist in the goals of a person. Assisting persons with their goals allows for learning and for meaningful exchanges of information by experience. The needs of a living organism form the lowest nodes of the reason hierarchy, and it must be assumed that any goal of a person fits at a higher level to achieve the end purpose. A goal can therefore be listed in an ordered hierarchy of process and procedure. Another achievement hierarchy would be to accomplish a goal with a person, which should be rewarding to those concerned, including Hercules, and would be implemented by simulating a rewarding state of identifiable successes in its environments. This may accord with social interaction, where the needs of others must be weighed to achieve a purpose.
However, Hercules may advance to this at some later stage; at the current stage of development Hercules will carry out any function requested, given the means and based on fact, e.g. a person may command Hercules to add 2 and 2, eject the CD from the CD-ROM, tell a joke, or answer a question. Hercules may be able to learn that when someone says "I need to put the CD in" or "OK Herc, CD!", then says "you dumb computer" and manually pushes the eject button, the next time Hercules hears something about a CD he should ask the user whether they wish Hercules to open the tray. This is of course open speculation, but quite possible and not too far-fetched.

3.11 Concepts forming reason of a living organism

The fundamental concepts for understanding goals lie in the 7 traits of living organisms, as discussed briefly in section 3.10. Without living organisms, the universe would be objects or energy confined in movement by physics. Sentient living organisms use a higher mental process to achieve their goals. Understanding the goals of a sentient organism allows a purpose to be formed, and therefore allows valid reasoning. Valid reasoning is the explanation of actions and events in achieving a purpose. Goals, purpose and reasoning allow Hercules to explore the meanings of actions. Exploring the meaning of actions allows expectations to be formed about oneself and the environment. Nutrition, Respiration, Movement, Excretion, Growth, Reproduction, and Sensitivity are the core motivations of every living being; therefore everything understood by Hercules will relate to one or more of these motivations, for why else would we do anything but to satisfy our needs, even out of instinct or subconscious action. It is the goals of the living organism that form the processes leading to the 7 traits, and such goals may be abundant in variety and colourful in nature. Take the male peacock, with feathers and plumage, expanding his tail to attract a mate for reproduction. This does not explain much on its own, but in a node hierarchy it could be seen as Expand Tail -> Attract Mate -> Reproduce.

3.12 Learning through abstract concepts

An abstraction of concepts can be formed from discourse as described in section 3.9. Repeated patterns in discourse reinforce valid communication structures, and valid communication structures can be observed. Figure 3.13 shows that where Sally makes a statement to Hercules similar to that of figure 3.12, extra weight is added to the frame remembered earlier where similarities exist. Using that frame later is a matter of discovery, such as where eggs, bacon, toast, salmon, chicken and turkey may be classified as food Sally likes, or that if Sally is eating eggs, she is likely eating bacon and toast too. It is a matter of social interaction and communication that will ultimately indicate the real probabilities of what a particular person is trying to communicate, given that a context can be established using patterns.
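The reinforcement of a remembered frame can be sketched as below. The store layout, the starting weight of 1, the increment of 1 and the use of wildcard matching from Python's `fnmatch` module are all assumptions made for illustration; the actual weighting is left to the formulas attached to each frame.

```python
from fnmatch import fnmatchcase

# Hypothetical frame store: frame pattern -> weight and remembered subjects.
frames = {
    "I like * * and *": {
        "weight": 1,
        "subjects": [["Chicken", "Salmon", "Turkey"]],
    }
}

def witness(sentence, subjects):
    """Add weight to every stored frame that the new sentence repeats."""
    reinforced = []
    for pattern, record in frames.items():
        if fnmatchcase(sentence, pattern):
            record["weight"] += 1
            record["subjects"].append(subjects)
            reinforced.append(pattern)
    return reinforced

hits = witness("I like eating Eggs, Bacon, and Toast too!",
               ["Eggs", "Bacon", "Toast"])
```

The remembered subject lists are what would later allow a grouping such as "food Sally likes" to be discovered.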

[Figure: Sally (Y) says "I like eating Eggs, Bacon, and Toast too!" to Hercules (Y). Hercules checks the frames "I like" and "* and *", adds weight to the repeated patterns, and remarks "Hmm, I better remember this one, I've seen it before!!"]

Figure 3.13: Hercules will add weight to patterns recognised in prior communications such as that of figure 3.12

Because the concept categories and expected subjects of the sentence are determined by probability, a Bayesian Network of both abstract concepts and frame patterns is constructed where ambiguity exists. Where the node hierarchies have set relationships, a table is constructed, and the probability of the frame fitting the subjects is considered where patterns are repeated; further weight is added or shifted according to the algorithms initiated by the script or query. Weight is added to possible abstract concepts, making them more probable when script formulae are later applied to control communications, such as where concept patterns have discernible attributes and a formula can be constructed to appropriately weight the frame in reference. When Hercules uses the communication patterns successfully, they become more probable again and are recorded as such. Unsuccessful communication patterns become demoted or redundant, and the weights and thresholds can be adjusted accordingly using the formula section of the frame or frames in reference. The concept fragment, wave or signature of section 3.61 can distinguish a pattern in order to adjust weight if this is discerned necessary in recognition. The contexts and goals of the communications are also relevant to pattern recognition and weighting. The goals of the communications need to be established, and will assist in providing a context and an actual understanding of the communications. Actual understanding allows a correct interpretation, and exploration of the communications may continue where given the means.

3.13 Underlying conceptual schemas and schema limitations

The underlying conceptual schemas form a generic means to build upon the core-concepts of movement from A to B, Object X at position A or B, Object X has attribute Y, and Object X's reason for moving was Z, as described in section 3.8. Reasons are also attached as attributes to an object, formed from the 7 basic needs of living organisms described in section 3.11. Limitations exist within the schema, and are actualised in the restrictions of general physics and in observations or descriptions of movement and actions.
Further limitations on the underlying concepts are attached as attributes when identified, such as when discovered in conversation or as a matter of fact, for example being told the height of a building or parsing and reading the colour of the sky in an encyclopaedia. Core-conceptual schemas are built using Object, Action, Attributes, Tense and Reason. Concept subject objects result from the observations. Formulae and algorithms can later manipulate the resulting statistics regarding the information recorded. Figure 3.14 gives an overview of the basic concept container, which holds enough subject attributes and can be linked to other containers if required. An object or action is not normally described by more than 3 consecutive adjectives or adverbs in everyday conversation. Also, a tense is either presumed or apparent, and a reason for the communication is likewise presumed or apparent. Because each object is related to an action, the object or action may be linked to other objects or actions. It is then the relationships in the node hierarchies that define the context or other interrelations. The relationships are complex and directly related to the purpose. The use of these object and action containers is left open for exploration in future implementations using scripts, formulae and algorithms, but they are nonetheless required as subject containers for data extracted where a pattern has been recognized.
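The basic concept containers described above might be modelled as two linked records. The sketch below uses Python dataclasses; the class and field names are assumptions, while the example values (a red ball, rolling quickly, identifiers 3 and 4) follow Figure 3.14.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ConceptObject:
    ident: int
    noun: str
    adjectives: List[str] = field(default_factory=list)  # up to 3 in the schema
    tense: Optional[str] = None
    reason: Optional[str] = None
    action_link: Optional[int] = None  # identifier of the linked action

@dataclass
class ConceptAction:
    ident: int
    verb: str
    adverbs: List[str] = field(default_factory=list)  # up to 3 in the schema
    tense: Optional[str] = None
    reason: Optional[str] = None
    object_link: Optional[int] = None  # identifier of the linked object

# "A red ball rolling quickly from A to B"
ball = ConceptObject(3, "Ball", ["Red"], tense="Present", action_link=4)
rolling = ConceptAction(4, "Rolling", ["Quickly"], object_link=3)
```

Each container points at the other through its link field, so an object and its action can be retrieved together.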


[Figure: diagram with the labels Object X, Y, A, D, T and Object Z, Y, B. The discourse "A red ball rolling quickly from A to B" passes through "Run Queries and Frame Formulas" and "Create Concept Subjects" to fill the two containers below.]

Concept Object: 3
Noun: Ball
Adjective 1: Red
Adjective 2:
Adjective 3:
Tense: Present
Reason:
Action Link: 4

Concept Action: 4
Verb: Rolling
Adverb 1: Quickly
Adverb 2:
Adverb 3:
Tense:
Reason:
Object Link: 3

Figure 3.14: Hercules fills a basic concept container for objects and actions by recognizing the subject matter of the discourse

3.14 Schemas based upon CD theory

Schank created a parser based on Conceptual Dependency (CD) Theory, described in "A conceptual parser for natural language", which identifies the relationships of concepts with each other and how a parser functions using concepts. The relationships of concepts are described by elements, where the elements are derived from rules common to all languages and concepts. Similar subjects sharing concepts can be interchanged, much like a semantic role in a sentence, and much like my description of concepts having subjects in frames. In Schank's parser, the concepts and the relationship elements form a generic concept; however, frames of the English language can also form an ordered concept schema. The graphical representation of concepts is no different to an ordered frame of English, where words such as "to", "the", "and", "will", "has" etc. can illustrate similar relationships when attributed to a particular category of concept. Schank attempted to simplify the concept creation process, and his work may be more relevant where used as an underlying conceptual schema to build upon. Analogous concepts can be formed using plain English with a context. For example, movement: "the * was *". The context is provided as a contextual predicator, and the components may be explained using CD theory. There may be a limitation to CD theory where subjects cannot be distinguished from each other in a broad concept. This requires the attributes of subjects to be the distinguishing factor over the basic concept, using a node hierarchy and a unique identifier for each node and attribute per category to achieve this through re-ordering relationships. Nonetheless, Schank's parser and CD theory provide more than reasonable proof of the importance of semantic roles in concept identification.

3.15 Critical Reasoning

Critical reasoning is a necessary part of sentient reasoning, where logical steps, relationships and assertions must be considered. Critical reasoning forms the basis of making a logical assumption or providing a reasonable expectation. Inductive and deductive reasoning are used as a model to form a hierarchy of expected statements within Hercules. The core concepts within the Hercules frame category database tables provide the foundations for critical reasoning in the parser.

1. A category is required to be determined for the subject of the frame.
2. The category then provides the context of the subject in order to distinguish the sense of a word or phrase.
3. In order to reason, the first premise is the first distinguished subject, and a second distinguished subject indicates a relationship with the first subject.
4. The relationship of the first and second subject is recorded in the Critical Reasoning database of Hercules.

The core-concept components of Object, Action, Attribute, Tense, and Reason are assigned and attributed with the new relationships. All assertions are assumed logical and true by Hercules, except where limitations can be applied to conflicts in truth discovered in communication.

3.16 Hierarchy for reasoning

Hercules has methods for testing a hierarchy. Methods such as IsA() and HasA() traverse the hierarchy of a particular database. Figure 3.15.1 is the hyponym node hierarchy of "Socrates is a man" (premise A).

=1> Socrates
=2> man

Figure 3.15.1: The hyponym hierarchy of Socrates for premise A

In Figure 3.15.2 the statement "All men are mortal" (premise B) creates a relationship with premise A shown in figure 3.15.1. The need to test the premise occurs only when challenged.

=1> Socrates
=2> man
+ (new relationship formed)
=1> (Men, man)
=2> mortal

Figure 3.15.2: The hyponym node hierarchy of premise A joined by relationship to the hyponym node hierarchy of premise B


When a premise is challenged, to test whether Socrates is mortal Hercules will call a method by formula to traverse the new hierarchy. Figure 3.15.3 shows the hyponym and attribute hierarchy of "Socrates is a man, All men are mortal". Within the node hierarchy there are 3 types of attribute at the same level for each node: ISA, HASA, and MEANS. Which relationship is created and what information is returned depends on what information is being requested. The ISA() method is capable of finding out whether Socrates is mortal.

ISA: Socrates
ISA: Man; HASA: =>head =>body; MEANS: Male Person
ISA: Mortal Organism; HASA: Adjective; MEANS: Subject to death

Figure 3.15.3: The hyponym and node hierarchy of premise A and B

Once the node hierarchy is established by relationships stored in a database external to Wordnet, any information can be added at a particular level. Say that premise C was that Socrates is a red-head; figure 3.15.4 shows that this relationship must be created at the correct level, in this case probably at level 2 of the hyponym hierarchy, as it is a newly distinguished attribute of Socrates. If the new node was placed under mortal, a mistake may be made where all mortals are believed to be red-heads.

IsA (Socrates, Mortal)
=1> [Socrates]
=2> man
=3> [mortal]
=4> Red-Head
=2> Red-Head
=3> [Has red hair]

Figure 3.15.4: The hyponym and node hierarchy of premise A and B and C. The response of the test is yes.
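The traversal performed by an IsA() style test can be sketched as a walk up the ISA links. The node store below is a hypothetical in-memory stand-in for the external relationship database, with the red-head attribute attached at the level of Socrates as the text recommends; the function is an illustration and not the Hercules implementation.

```python
# Hypothetical relationship store: node -> relation -> targets.
nodes = {
    "socrates": {"ISA": ["man", "red-head"]},
    "man": {"ISA": ["mortal"], "HASA": ["head", "body"], "MEANS": "male person"},
    "mortal": {"MEANS": "subject to death"},
    "red-head": {"HASA": ["red hair"]},
}

def is_a(subject, category, seen=None):
    """Return True if category is reachable from subject through ISA links."""
    if seen is None:
        seen = set()
    if subject in seen:  # guard against cyclic relationships
        return False
    seen.add(subject)
    for parent in nodes.get(subject, {}).get("ISA", []):
        if parent == category or is_a(parent, category, seen):
            return True
    return False

answer = "yes" if is_a("socrates", "mortal") else "no"
```

Because red-head hangs off Socrates rather than mortal, `is_a("man", "red-head")` is false, so the mistake of treating all mortals as red-heads does not arise.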


3.17 Database Structures for reasoning

Wordnet provides Hyponym trees (ISA), Meronym trees (HASA), and sense definitions (MEANS). The Wordnet information provides the template for the creation of basic objects and concepts. A Hyponym tree is the ontology of category or domain for a word sense. A Meronym tree is the ontology of composition of a word sense. A premise of category can be tested against the Hyponym tree, and a premise of composition can be tested against the Meronym tree. Hercules uses Wordnet to test the premises of what a thing can comprise or what kind of thing it is. When considering the structures for reasoning, the attributes relate directly to the IsA() and HasA() functions. The data must be tagged and attributed correctly within the database, and must be added to the database with the correct attributes in the correct hierarchy. Figure 3.16 shows a hierarchical representation of the collection of nodes for Socrates, in which the new attribute Mortal is attached and maintained at the correct level with its metadata. Where mortal is a new attribute, the node hierarchies for mortal are also maintained, though unused until a method requires the information of mortal.

=1> Socrates
=1> Sense 1: ISA man, adult male
=1> male, male person
=2> person, individual, someone, somebody, mortal, soul
=3> organism, being
=4> living thing, animate thing
=5> object, physical object
=6> physical entity
=7> entity
=1> New Attribute + [Metadata] : man + Isa: Mortal
=1> Sense 1: MEANS: Definition: (1437) man, adult male -- (an adult person who is male (as opposed to a woman); "there were two women and six men on the bus")
=1> Sense 1: HASA man, adult male
HAS PART(1): adult male body, man's body
HAS PART(2): beard, face fungus, whiskers
HAS PART(3): mustache, moustache
=1> male, male person
HAS PART(1): male body
HAS PART(2): male reproductive system

Figure 3.16: Socrates Node Hierarchy of Wordnet data using relationships


Hercules requires that a database separate from Wordnet is created for Critical Reasoning. The database is a collection of premises and their relationships, referenced back to the original Wordnet hierarchies. The premises can then be referenced as a hierarchy of nodes where new relationships are attributed to each premise.

3.18 Script Formula examples for Critical Reasoning

A premise and attribute have previously been added within the Socrates Node Hierarchy as shown in section 3.16. The test of the reasoning the premise pertains to can be called via a formula attached to a frame. The context and category of the following frame are assumed to be ISA for this example. Figure 3.17 shows the query in red.

1. Discourse is provided to Hercules of "Is Socrates mortal?"
2. The processes of section 3.62 are carried out with the data below the script in red.
3. Hercules is designed so that the script simplifies the entry of data. Rather than a person entering large amounts of complex data, all that is required is the script in red to handle any question such as "is Socrates mortal?" or "is Cleopatra female?" or any combination of "is [something] [something]" or "is * *?"

Script: Wordcount 3, Frame ISA id 1 is present
Discourse: is Socrates mortal?
Frame: Is * *
Frame Unique ID: 1
Frame Category: ISA
Frame Formula: ISA 3 a 2
Method Called: BOOL IsA( [sWord:Char:Socrates], [sWord:Char:Mortal] );
Method Action: Call Parse Method on Socrates node hierarchy for ISA: mortal
Method Return: TRUE (when attribute/node [Mortal] is located)
Method Response: printf("Hercules can see %s is a %s", [sWord:Char:Socrates], [sWord:Char:Mortal] );
Hercules Response Output: Hercules can see Socrates is a mortal

Figure 3.17: Example script for using the Hercules ISA method for testing the node hierarchy of Socrates

Providing the context is still that of Socrates, the question is really "What does mortal mean for Socrates?"

Hercules uses scripts and searches Wordnet and the external databases for discourse metadata to provide the information below in the following hierarchy. Figure 3.18 illustrates that in a node hierarchy, a relationship can be created that supports Hercules in retrieving data stored by reason provided in conversation.
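The flow of the "Is * *" script of Figure 3.17, from frame recognition through the ISA test to the formatted response, can be sketched as below. The premise store and the function names are hypothetical, the traversal assumes the store contains no cycles, and returning `None` when the test fails is an assumption because the source does not state the failure response.

```python
# Hypothetical premise store: node -> ISA parents (assumed acyclic).
ISA = {"socrates": ["man"], "man": ["mortal"]}

def is_a(subject, category):
    """Walk the ISA links upward looking for the category."""
    parents = ISA.get(subject, [])
    return category in parents or any(is_a(p, category) for p in parents)

def answer(discourse):
    """Handle discourse of the form 'is * *?' as in Figure 3.17."""
    words = discourse.rstrip("?").split()
    # Script condition: word count 3 and the frame "Is * *" is present.
    if len(words) == 3 and words[0].lower() == "is":
        subject, category = words[1], words[2]  # words 2 and 3 of the frame
        if is_a(subject.lower(), category.lower()):
            return "Hercules can see %s is a %s" % (subject, category)
    return None

reply = answer("is Socrates mortal?")
```

For the discourse of Figure 3.17 the reply is the same sentence shown as the Hercules Response Output.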


=1> Socrates (External Database Term)
    =1> Sense 1: ISA (Wordnet Hyponym tree): man, adult male
        =1> male, male person
        =2> []
    ISA: =1> Mortal (External Database Relationship to:)
        MEANS =1> Sense 1: Definition: Mortal (Wordnet Definition): (3) mortal -- (subject to death; "mortal beings")

Figure 3.18: Socrates node hierarchies and the Mortal definition can be traced through node relationships

A script, frame and formula can be used to retrieve specifically the meaning or definition of Socrates being mortal. Provided that the correct meaning of Mortal is attributed to Socrates, we can use the simple script described in red in figure 3.19 to receive a definitive answer to the question "What does mortal mean?" Please note that this script can be used to find out what anything means for anything where "What does * mean in *" is used. This allows us to ask Hercules "what does an eagle mean in golf?" or "what does think mean in person?"

Script: Wordcount 4, Frame MEA id 1 is present, or, Wordcount 6, Frame MEA id 2 is present
Discourse: what does mortal mean?
Context Subject Predicator: Socrates
Frame1: What does * mean
Frame Unique ID: 1
Frame Category: MEANS
Frame SUBJECT: Socrates
Frame Formula: MEANS INA 3 in SUBJECT
or
Frame2: What does * mean in *
Frame Unique ID: 2
Frame Category: MEANS
Frame SUBJECT: SPECIFIED
Frame Formula: MEANS INA 3 in 6
Method(s) Called: 1. sWord InA( [sWord:Char:Socrates], [sWord:Char:Mortal] ) and 2. sWord Means([sWord:Mortal:Char:Definition])


Method(s) Action: Call Parse Method 1 on Socrates node hierarchy for return of object sWord: ISA: mortal, then call method 2 to return the mortal definition for output
Method 1 Return: Return sWord Node (when attribute/node [Mortal] is located)
Method 2 Return: Return definition char array for use in output or response
Method Response: printf("%s means %s", [sWord:Char:Mortal], [sWord:Mortal:Definition:Char:Mortal] );
Hercules Response Output: Mortal means subject to death

Figure 3.19: Script, data and methods for finding what mortal means for Socrates, or what anything means for anything if given the context

In figure 3.19 the Hercules response conforms to a format set by the formula attached to the frame * means *. It should be noted, however, that any number of scripted operations may be performed on the Hercules response data, provided a method is constructed and the location of the data is known. Through scripts, that data can be grouped into sets, compared, and used in statistical and mathematical calculations or analysis.

Initially the scripts have been hand written. The writing of scripts is partly automated in the GUI of Hercules so that a person can write scripts without needing to know a computer programming language. This automation uses hard-coded methods; however, once sufficient methods are constructed, access to the methods is then provided to Hercules. This automation of scripts via methods ultimately allows Hercules to write its own scripts. A separate scripting database for learning will assist Hercules in rewriting its own scripts based on pattern recognition, knowledge acquired via conversation, and corrections to fact. Current scripts for Hercules are located in the Pattern-Query Table of the databases.
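The frame matching behind figures 3.17 and 3.19 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Hercules implementation: the dictionaries stand in for the external premise database and the Wordnet definitions, and the names match_frame, is_a, means and respond are hypothetical.

```python
# Hypothetical in-memory stand-ins for the external premise database
# and the Wordnet definitions (assumptions, not the Hercules schema).
ISA_FACTS = {"socrates": {"man", "mortal"}}
DEFINITIONS = {"mortal": "subject to death"}


def match_frame(frame, discourse):
    """Return the words bound to each '*' if the discourse fits the frame, else None."""
    frame_words = frame.lower().split()
    words = discourse.lower().rstrip("?.!").split()
    if len(frame_words) != len(words):  # the Wordcount test of the script
        return None
    bound = []
    for pattern, word in zip(frame_words, words):
        if pattern == "*":
            bound.append(word)
        elif pattern != word:
            return None
    return bound


def is_a(subject, attribute):
    """ISA test: is the attribute stored in the subject's node hierarchy?"""
    return attribute in ISA_FACTS.get(subject, set())


def means(attribute):
    """MEANS lookup: return the stored definition, or None if unknown."""
    return DEFINITIONS.get(attribute)


def respond(discourse, context_subject=None):
    """Try the ISA frame, then the two MEANS frames, and format a response."""
    bound = match_frame("is * *", discourse)
    if bound is not None:
        subject, attribute = bound
        if is_a(subject, attribute):
            return f"Hercules can see {subject.capitalize()} is a {attribute}"
        return None
    bound = match_frame("what does * mean in *", discourse)
    if bound is not None:
        attribute, subject = bound
    else:
        bound = match_frame("what does * mean", discourse)
        if bound is None or context_subject is None:
            return None
        attribute, subject = bound[0], context_subject.lower()
    definition = means(attribute)
    if is_a(subject, attribute) and definition:
        return f"{attribute.capitalize()} means {definition}"
    return None


print(respond("is Socrates mortal?"))
print(respond("what does mortal mean?", context_subject="Socrates"))
```

The wildcard positions play the role of the word indices in the frame formulas, and the context subject stands in for the Frame SUBJECT of Frame1.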
3.19 Fact and Truth Corrections of the Databases

A correction may be made to information via learning, such as where the wrong sense definition is communicated to a person. A mistake in fact requires the user to inform Hercules of the error; alternatively, Hercules will ask for a correction if a discrepancy is encountered. The correction is asked for in a manner that simulates how a person may discover the actual truth of the circumstances. Any manner of simulation will disguise the actual methods used. Hercules will personalise the communications using familiar and accepted terminology. For example, alternate and random phrases will request the correct information, possibly using enthusiastic-sounding statements and humour to engage the user in providing correct information. Figure 3.20 illustrates how a statement such as "How interesting!" or "hehe!" may emotionally engage the user to continue discussions. The words in Hercules' responses can provoke an emotional state in the user, where the appearance of emotion is perceived in Hercules' statements. It would be interesting to measure the responses of individuals to different statements made in this manner.


Hercules: How interesting! Do you really like Bacon and Eggs? I like dogs, hehe!

Figure 3.20: Hercules simulates an interesting and engaging manner in communications with others

3.20 Setting the Database Data

Method SetIsA() is called to set data for * is a *. This allows a database external to Wordnet to create a relationship between words 1 and 4 of the statement. The script for the frame to set the data is Wordcount 4, Frame SETISA is PRESENT, with the attached formula SETISA 1 is 4. Please note that any SQL statement may also set the data; however, XML SQL Table Data-adapters are used by the methods. Methods such as SetHasA() and SetMeans() are constructed in a similar fashion, ensuring the correct information is updated.

3.21 Abstractions of real concepts for Analogy

For an analogy to occur, Hercules must generalise the concepts. The concepts and subjects must be abstracted to a greater degree. Parsing the hyponym node hierarchies for category information provides the basis for an abstraction of the concepts and subjects in question. To make an abstraction of concepts, the discourse is generalised for all senses and hierarchies. Core concept frames allow this to happen. Take the statement "Socrates is a man, all men are mortal, therefore Socrates is mortal." Figure 3.21 shows the frame that underlies the sentence subjects.

* is a *, all * are *, therefore * is *

Figure 3.21: The subjects removed from a sentence create a frame

It is clear from examination of the frame in figure 3.21 that the subjects have been removed. Smaller frames already within Hercules are matched to the syntax of the statement. Please note that the method SetIsA(), as described in section 3.20, is called to set data for * is a * using scripts similar to those of section 3.18. This allows a database external to Wordnet to create a relationship between words 1 and 4 of the statement above. Relationships are created through critical reasoning and pattern recognition; subjects, however, are abstracted using the Wordnet hyponym hierarchies. The level of abstraction of a concept depends on the possible abstractions made from the nodes of the hyponym hierarchy.
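As a rough illustration of the SETISA 1 is 4 formula, the following Python sketch stores the relationship between words 1 and 4 of a four-word statement in a small SQL table. It is a sketch under assumptions: the table name isa, its columns, and the functions open_premise_db, set_is_a and is_a are hypothetical, and an in-memory SQLite database stands in for the XML SQL Table Data-adapters that the Hercules methods use.

```python
import sqlite3


def open_premise_db():
    """Create a throwaway in-memory database with one hypothetical ISA table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE isa ("
        "subject TEXT NOT NULL, category TEXT NOT NULL, "
        "UNIQUE (subject, category))"
    )
    return conn


def set_is_a(conn, statement):
    """Apply 'SETISA 1 is 4': relate word 1 to word 4 when the frame '* is a *' fits."""
    words = statement.lower().rstrip(".!?").split()
    # Script condition: Wordcount 4 and the frame '* is a *' is present.
    if len(words) != 4 or words[1] != "is" or words[2] != "a":
        return False
    conn.execute(
        "INSERT OR IGNORE INTO isa (subject, category) VALUES (?, ?)",
        (words[0], words[3]),
    )
    return True


def is_a(conn, subject, category):
    """Check whether the relationship has been stored."""
    row = conn.execute(
        "SELECT 1 FROM isa WHERE subject = ? AND category = ?",
        (subject.lower(), category.lower()),
    ).fetchone()
    return row is not None


conn = open_premise_db()
set_is_a(conn, "Socrates is a man")
print(is_a(conn, "Socrates", "man"))
```

A statement that does not fit the frame, such as "all men are mortal", is left for a different frame and formula to handle.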


Wordnet provides us with the following information:

Socrates
=1> man

man, adult male
=1> male, male person
=2> person, individual, someone, somebody, mortal, soul
=3> organism, being
=4> living thing, animate thing
=5> object, physical object
=6> physical entity
=7> entity

Mortal
=1> Adjective: Definition: Subject to Death

Figure 3.22: The hyponym hierarchies forming the abstracted concept with the definition metadata

A table of possible concepts is created by factorizing the domain category nodes, where each node of the category of each sense is considered in probability for the least ambiguity. This means each word of each possible sense is factorized into an array, and each lower node belongs to a new row of the column. Table 3.3 illustrates the elements the table can comprise, where each column's category elements are multiplied by the elements in the next column, providing a 2 x 7 matrix of concept combinations as shown in figure 3.23. Please note that the senses have not yet been factorized in Table 3.3, as it is presumed for this example that the correct sense of each word has been identified using the frame * is a * and that the categories are correct for the sense. An actual implementation would be much more complex than this small example, requiring the factors of table 3.3 to include the senses of ambiguous discourse and the probabilities and adjustments for correction based on past, present and expected input of communications.

=1> Socrates    =1> Male
=2> Man         =2> Person
                =3> Organism
                =4> Living Thing
                =5> Object
                =6> Physical Entity
                =7> Entity

Table 3.3: The hyponym hierarchies for "Socrates is a man" using frame * is a *

Table 3.4 shows table 3.3 expanded as a 2 x 7 matrix of concept combinations. For larger frames, the number of concept combinations grows exponentially. The table can be reduced to a simplified form in which the associated node levels are represented by the ranges of the nodes in that category.


Socrates: 0    Male: 1
Socrates: 0    Person: 2
Socrates: 0    Organism: 3
Socrates: 0    Living Thing: 4
Socrates: 0    Object: 5
Socrates: 0    Physical Entity: 6
Socrates: 0    Entity: 7
man: 1         Male: 1
man: 1         Person: 2
man: 1         Organism: 3
man: 1         Living Thing: 4
man: 1         Object: 5
man: 1         Physical Entity: 6
man: 1         Entity: 7

Table 3.4: Shows the 2 x 7 Matrix of concept combinations of table 3.3

[Socrates: 0] (is a) [male person: 1]
[Socrates: 0] (is a) [person: 2]
[Socrates: 0] (is a) [Organism: 3]
[Socrates: 0] (is a) [Living thing: 4]
[Socrates: 0] (is a) [Object: 5]
[Socrates: 0] (is a) [Physical entity: 6]
[Socrates: 0] (is a) [entity: 7]
[man: 1] (is a) [male person: 1]
[man: 1] (is a) [person: 2]
[man: 1] (is a) [Organism, being: 3]
[man: 1] (is a) [Living thing, animate thing: 4]
[man: 1] (is a) [Object: 5]
[man: 1] (is a) [Physical entity: 6]
[man: 1] (is a) [entity: 7]

Figure 3.24: Shows the (is a) node relationships created by Hercules between the table elements of table 3.4

The concept ranges can be reduced as shown in figure 3.25, where unique concept category hierarchies can be recognised. It is likely that a binary representation of the remaining hierarchy, from the second node down in each subject, is used to abstract the concept, depending on the relationship. Otherwise the next node down from the subject will be the first point of abstraction of the sense. The range that the concept will cover in analogy is limited later by experience.

[Socrates, sense1, isa, 1-7]
[Man, sense 1, isa, 1-7]
[0010010101, 10101010, 1-7]
[0100101010, 10101010, 1-7]

Figure 3.25: Shows the overall categorised and ranged concept in a reduced and understandable way
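The expansion of table 3.3 into the combinations of table 3.4 and figure 3.24, and the reduction to a range as in figure 3.25, can be sketched as a Cartesian product. This is an illustrative sketch only: the function names are hypothetical, the node lists are typed in by hand rather than read from Wordnet, and the binary representation of figure 3.25 is not reproduced.

```python
from itertools import product

# Hand-entered copies of the two columns of table 3.3: (node, level).
SUBJECT_NODES = [("Socrates", 0), ("man", 1)]
CATEGORY_NODES = [
    ("male person", 1),
    ("person", 2),
    ("organism", 3),
    ("living thing", 4),
    ("object", 5),
    ("physical entity", 6),
    ("entity", 7),
]


def concept_combinations(subjects, categories):
    """Pair every subject node with every category node as an (is a) relation."""
    return [
        f"[{subject}: {s_level}] (is a) [{category}: {c_level}]"
        for (subject, s_level), (category, c_level) in product(subjects, categories)
    ]


def reduce_to_range(subject, categories, sense=1):
    """Collapse one subject's combinations into a single ranged entry."""
    levels = [level for _, level in categories]
    return f"[{subject}, sense {sense}, isa, {min(levels)}-{max(levels)}]"


combinations = concept_combinations(SUBJECT_NODES, CATEGORY_NODES)
print(len(combinations))  # 2 x 7 = 14 combinations
print(reduce_to_range("Socrates", CATEGORY_NODES))
```

With more wildcard positions in a frame, one more list is passed to the product for each position, which is why the number of combinations grows so quickly for larger frames.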


Figure 3.26 shows the relationships between the data of figure 3.22. The concept categories are derived from Wordnet and have had relationships established by Hercules in prior conversations, where premise A, "Socrates is a man", and premise B, "All men are mortal", have formed the relationships. Abstractions can then be made once the relationships have been established, supporting the creation of concept abstractions by Hercules.

Socrates is a man                                   all men are mortal
(* is a *)                                          (all * are *)

=0> Socrates  (is a)  =0> man, adult male   (is a)  =0> Mortal
=1> man               =1> male, male person         =1> Subject to death
                      =2> person, individual, someone, somebody, mortal, soul
                      =3> organism, being
                      =4> living thing, animate thing
                      =5> object, physical object
                      =6> physical entity
                      =7> entity

Figure 3.26: Illustrates the relationships created by Hercules using hyponym data and critical reasoning

Assuming that all statements are logical and true, figure 3.26 shows the first and second premises accepted and grouped with the hyponym data of Socrates and man, with the attribute definition of Mortal. However, abstracted concepts of a premise may not apply in all circumstances, since the abstraction can become too vague. The validity of an abstracted concept is tested in the request or provision of information in communications. Given the opportunity, Hercules may say to a person "a male person is subject to death, yes?" or "Organism is subject to death, yes?" and so on. Where an abstraction of a concept is too general, the analogy may not apply. A distinction is drawn by Hercules during conversation where a conflict is noticed. The conclusion to premises A and B is drawn in "therefore Socrates is mortal", which is taken as true and correct; it is in the abstractions that a distinction may apply. A person may state "an object is not mortal", but the truth of premise A must be maintained for Socrates while remaining applicable to other circumstances. The analogy must be distinguished by category, as shown by figure 3.27, where we understand that not all objects are subject to death. Hercules does not yet have the experience to know that information until it is proposed in another statement.


* (is a) [male person: 1]       [subject to death: 1]
* (is a) [person: 2]            [subject to death: 1]
* (is a) [Organism: 3]          [subject to death: 1]
* (is a) [Living thing: 4]      [subject to death: 1]
* (is a) [Object: 5]            [subject to death: 1]   Object becomes too abstract to be certain
* (is a) [Physical entity: 6]   [subject to death: 1]
* (is a) [entity: 7]            [subject to death: 1]

Figure 3.27: The distinction made to the concept category where the concept becomes too abstract

The abstraction can be limited above at [Object: 5] [subject to death] when considering a likely statement by a real person such as "Not all objects are subject to death" or "[object: rock] is not subject to death". At this point of the conversation, with the provision of new information that conflicts with the concept, a distinguished category may be considered: not all objects are living things subject to death. While communications do not diverge from the premises, there is no need to consider any limitation. For example, if a person states "My [object: cat] died", the statement displays no divergence from, or limitation to, the established abstractions or premises in categories 1-7 above. It is only when a conflict arises that a new premise can be established or limited.

3.22 Limitations on Abstract Concepts

It is noted here that the combination of abstractions allows Hercules to predict and recognise what other concepts or subjects may be forthcoming. Experience of communications allows a predicted conclusion to be drawn. The clause "therefore" is one indication of a predicate conclusion drawn in conversation. For the example using figure 3.28, please assume that the framed concept of "ingestion", "therefore", and "died" has been firmly established in the core concepts using predicate logic, set theory, tenses and the 7 traits of living organisms.
The frame of a concept is shown without the subjects, and below the frame is a collection of concepts that have been abstracted to a lower level in the concept category node hierarchy for each subject. Experiences in communication will form and limit the perception of concepts and the conclusions drawn. Consider the "therefore" statement "Socrates ingested poison therefore he died." Consider now the probability of Hercules understanding the statement "Cleopatra ingested poison therefore she died" after hearing about Socrates ingesting poison. Figure 3.28 shows that where a concept has been abstracted, the semantic role of the subject is preserved. This allows Hercules to compare similar sentences more easily for established patterns, thereby establishing a probability for an expected subject sense or type.

* ingested * therefore * died
[person] ingested [poison] therefore [person] died

Figure 3.28: A frame and concept and abstraction within a given range


Given that the structure of the sentence is similar for both statements about Socrates and Cleopatra, the subjects can be abstracted towards concepts in common, which determine both the expected type of data to fit the frame and the limit on the abstraction of the concept. The Wordnet Hyponym data for Woman is:

=0> Cleopatra   =0> woman, adult female
=1> Woman       =1> female, female person
                =2> person, individual, someone, somebody, mortal, soul
                =3> organism, being
                =4> living thing, animate thing
                =5> object, physical object
                =6> physical entity
                =7> entity

Figure 3.29: The hyponym data for Cleopatra

=0> Socrates  (is a)  =0> man, adult male
=1> man               =1> male, male person
                      =2> person, individual, someone, somebody, mortal, soul
                      =3> organism, being
                      =4> living thing, animate thing
                      =5> object, physical object
                      =6> physical entity
                      =7> entity

Figure 3.30: The hyponym data for Socrates

When comparing the hyponym data of Cleopatra in figure 3.29 with that of Socrates in figure 3.30, it can be deduced that the closest point of abstraction for man and woman is at position 2 of the man and woman hyponym data. This position corresponds to the shared category of Person, where person also shares the remaining hyponym categories of a particular word sense. The position is not the indicator; rather, the existence in a particular category is the indicator. In this example [Person] is the common attribute, shared from [Person] to [Entity], and may indicate a shared sense if sufficiently identified by a detailed category node hierarchy. To revisit the statements "Socrates ingested poison therefore he died" and "Cleopatra ingested poison therefore she died", the assumption made by Hercules from the abstraction above provides both an analogy, shown in figure 3.31, and an expectation and probability that, when matching the frame "* who ingest poison will die" to another sentence, the first word of the frame is likely to be of the type [persons].
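The comparison of the two hyponym chains can be sketched as a search for the first shared category. The sketch below is a simplified illustration, not the Hercules method: the chains are hand-copied from figures 3.29, 3.30 and 3.39 with one label per node, and the function names are hypothetical.

```python
# One label per node, most specific first (hand-copied, simplified).
MAN = ["man", "male", "person", "organism", "living thing",
       "object", "physical entity", "entity"]
WOMAN = ["woman", "female", "person", "organism", "living thing",
         "object", "physical entity", "entity"]
ROCK = ["rock", "natural object", "whole", "object",
        "physical entity", "entity"]


def closest_shared_category(chain_a, chain_b):
    """Return the most specific node of chain_a that also occurs in chain_b."""
    members_b = set(chain_b)
    for node in chain_a:
        if node in members_b:
            return node
    return None


def shared_range(chain_a, chain_b):
    """Return every node of chain_a shared with chain_b, most specific first."""
    members_b = set(chain_b)
    return [node for node in chain_a if node in members_b]


print(closest_shared_category(MAN, WOMAN))
print(shared_range(MAN, WOMAN))
print(closest_shared_category(MAN, ROCK))
```

The lookup is by membership rather than by index, which mirrors the remark that the position is not the indicator: the rock chain reaches object at a different position than the man chain does.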


[persons] who ingest poison will die.

Figure 3.31: An analogy where the first subject of comparable discourse is abstracted

Hercules will now expect that a [person] will be described in discourse characterized by common attributes and the hyponym hierarchy. Hercules is also able to apply the resulting analogy to other discourse. Other factors can later be explored through experience and patterns to recognise the probabilities of any other types being present.

3.23 Forming Analogies

In order to form an analogy, we must first abstract the concepts of a particular kind as described in section 3.22. In conversation, we are able to heuristically use abstractions of a kind as synonyms in language, and still remember what the synonym pertains to. Consider the following types of things in figure 3.32: a person, a rock and a cat. Each type of thing has a hierarchy of kinds of thing.

Person [person: 0] [organism: 1] [living thing: 2] [object: 3]

Rock [rock: 0] [natural object: 1] [whole, unit: 2] [object: 3]

Cat [cat: 0] [feline: 1] [carnivore: 2] [placental: 3] [mammal: 4] [vertebrate: 5] [chordate: 6] [animal: 7] [organism: 8] [living thing: 9] [object: 10]

Figure 3.32: Hyponym hierarchies provided by Wordnet for person, rock, and cat

In figure 3.32 we can see that a type of person is both a kind of [organism: 1] and a kind of [living thing: 2]. A type of cat is a kind of [mammal: 4] and a kind of [animal: 7]. We can also heuristically substitute a kind of thing in conversation and still understand what type the kind relates to. I can talk about the cat as a feline, then the feline as an animal. Now consider: the animal was fuzzy and her name was Sally. You can guess that I am still talking about the cat (unless you are tired). Now consider the frame of that sentence in figure 3.33.


The * was fuzzy and her name was Sally
The [cat: 0] was fuzzy and her name was Sally
The [feline: 1] was fuzzy and her name was Sally
The [animal: 7] was fuzzy and her name was Sally

Figure 3.33: Heuristic substitution and abstraction using the hyponym hierarchy of a particular word sense

Analogy is made through the abstraction of concepts. The point of abstraction in a frame of discourse is similar to a particular semantic role of the subject in the discourse. Presuming people converse rationally, the word provided by a person at the point of abstraction should make sense with the remaining discourse. The hyponym hierarchy of a new subject at the point of abstraction is presumed to be valid if provided by a person, and can then be abstracted. The analogy of figure 3.31, resulting from an earlier example of abstraction of concepts, involved Socrates and Cleopatra. The analogy was: [persons] who ingest poison will die.

The broader the abstraction of concepts, the more subjective the analogy becomes. The depth of analogy is most sensible at the closest of the more established premises. That is to say, we are more able to make sense of an analogy where the categories are more specific. It is easier to distinguish [Socrates: 0] than [organism: 2] or [living thing: 3], even if terms 2 and 3 both refer in theory to Socrates. Words sharing the same categories qualify the analogy as valid. Subjects sharing the same categories can be recognised as the same kind at particular levels of abstraction using the node hierarchies. If no similarities exist in a category of subject, there is no relationship with the analogy category, and the analogy cannot be applied. Figure 3.34 shows the comparison of the hyponym hierarchies, where Cleopatra and Socrates share a common hierarchy at person, and where a rock at first appears to be unrelated in category until the node of object is compared.

=0> Cleopatra   =0> Woman      =0> Socrates   =0> Man
=1> Woman       =1> Female     =1> Man        =1> Male
                =2> Person                    =2> Person
                =3> Organism                  =3> Organism
                =4> Living                    =4> Living
                =5> Object                    =5> Object

=0> Rock
=1> Natural Object
=2> Whole
=3> Object

Figure 3.34: Shows the comparison of hyponym hierarchies of Cleopatra, Socrates, and a Rock

A rock matches none of the attributes in the hierarchy above [Person: 0]; therefore the analogy of person cannot apply to the rock. This is because the kinds of concept categories shared by Cleopatra and Socrates and their hyponym hierarchies are too dissimilar from the hyponym hierarchy of the rock at the level of [Person: 0]. By taking the abstraction of the analogy further, in comparison to [Object: 3], the rock may mistakenly fit the analogy by sharing a kind in common. Therefore, in this example, the analogy may be incorrect. The analogy may be corrected by a statement from a person or by reference to fact.

3.24 Corrections and limitations on Analogy

Below is an example of how Hercules would limit the scope of the analogy. The analogy templates remain Active (True) until challenged. Consider the frame in figure 3.35 below, which is the frame of the concept analogy of figure 3.31 where the concept category has been replaced with a wild card that substitutes for any word and sense.

* who ingest poison will die.

Figure 3.35: The frame of the concept analogy of figure 3.31

Hercules made the higher-level analogy resulting in this frame in the earlier example shown by figure 3.31, using an abstraction of concepts for Socrates and Cleopatra. Hercules has this information in its memory. Given that attributes and data structures exist in Hercules similar to the already reasoned "Socrates is a man, all men are mortal" example, we may broadly understand the syntax structures of figure 3.36 using the frame and analogy. Plain-text syntax combining concepts and frames can provide a sufficiently understandable format for storing frame data within a database. Even if the data stored in the database is not in exactly this format, when an administrator of Hercules interprets the concepts and data, they should be presented in a sufficiently understandable format such as that of figure 3.36 or 3.37.

* who ingest poison [[will die] : subject to death]

Figure 3.36: Syntax structures of concepts and frames combined

The frame of figure 3.36 and any attribute, such as the definition of a word, may be further abstracted using the category information as well as ranges, as described in section 3.21 and by figure 3.25. The result is shown in figure 3.37.

[[Category]: 0-5][who ingest poison [[will die: 1[Attribute: Definition: subject to death]]].

Figure 3.37: Syntax for concepts and frames with category information included

Having a general concept outline as described by figure 3.37, the concept can now be limited by another statement of fact. A statement is provided by a person or from parsing an encyclopaedia. The statement is:


A rock is not subject to death.

Figure 3.38: Statement of fact provided by a person

The statement of figure 3.38 can now be used to distinguish the limits of the analogy concept of figure 3.37. To limit the analogy, consider a distinction in the analogy of figure 3.37, where the hyponym hierarchy for Rock shown in figure 3.39 allows us to determine the category and kind of distinction. The distinction must be drawn at the most recognisable level, which is at node 3 of figure 3.39 and [object: 3] of figure 3.40. There is a category and kind in common between the rock and the person.

=0> rock, stone
=1> natural object
=2> whole, unit
=3> object, physical object     (Category of distinction in common)
=4> physical entity
=5> entity

Figure 3.39: Rock hyponym hierarchy with the object category of distinction

[person: 0]            [who ingest poison will die: 1] (Active)
[Organism: 1]          [who ingest poison will die: 1] (Active)
[Living thing: 2]      [who ingest poison will die: 1] (Active)
[Object: 3]            [who ingest poison will die: 1] (Active)
[Physical entity: 4]   [who ingest poison will die: 1] (Active)
[entity: 5]            [who ingest poison will die: 1] (Active)

Analogy is limited

Figure 3.40: The expanded concept frame to a table or array of data

This is the point beyond which a distinction cannot be drawn between differing subjects of the analogy that share a category in common. Where the distinction cannot be drawn, the analogy hierarchy from that point on is not viable and must be Inactive. The entire analogy must not be removed, as the analogy is correct depending on the subjects. It is only the limitation through distinction that is the mechanism for applying the analogy and determining an understanding of the statement. By traversing down the hyponym hierarchy forming the abstractions of the analogy, we are able to determine the actual category, either verbatim or via synonyms. The point of distinction, when discovered, then limits the scope of the analogy. Given that only one sense of the word Rock is of the particular category intended by the person, and that this is known, with a distinguished word sense and unique hyponym hierarchy Hercules can draw a distinction limited to [Living thing: 2] in the concept table shown in figure 3.40. The analogy is now limited to the correct level of abstraction for it to remain valid.


[Person: 0]            [who ingest poison will die: 1]   (Active) (Weight: x%)
[Organism: 1]          [who ingest poison will die: 1]   (Active) (Weight: x%)
[Living thing: 2]      [who ingest poison will die: 1]   (Active) (Weight: x%)
[Object: 3]            [who ingest poison will die: 1]   (Inactive) no longer viable analogy
[Physical entity: 4]   [subject to death: 1]             (Inactive) no longer viable analogy
[Entity: 5]            [who ingest poison will die: 1]   (Inactive) no longer viable analogy

Figure 3.40: Illustrates how the distinction drawn from the user input of figure 3.38 deactivates categories of the analogy and concept

Figure 3.41 below illustrates the overall picture of the upper and lower limits of analogy in context with relationships and attributes. ISA relationships are shared in common between Socrates and Cleopatra, allowing the upper limit to be established. The lower limit is drawn where facts restrict the analogy before extraneous attributes are allowed to conflict, indicating a different kind of concept below the limit.

IsA =0> Socrates    =1> Man      IS: Mortal
IsA =0> Cleopatra   =1> Woman    IS: Mortal

Analogy Upper Limit
=2> person, individual, someone, somebody, mortal, soul
=3> organism, being
=4> living thing, animate thing
Analogy Lower Limit

=0> Rock   =1> Object   NOT: Mortal

Analogy Frame is:
[ [[Person(s)] to [Living Organisms] except [Object]] ingest Poison will die ]

Figure 3.41: The upper and lower limits of analogy in context with relationships and attributes
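The deactivation shown in the second figure 3.40 and summarised in figure 3.41 can be sketched as follows. This is a minimal illustration under assumptions: the chains are hand-copied with one label per node, the function names limit_analogy and analogy_applies are hypothetical, and weights are left out.

```python
# Categories of the analogy '* who ingest poison will die', most specific first.
ANALOGY_LEVELS = ["person", "organism", "living thing",
                  "object", "physical entity", "entity"]

# Hyponym chains copied by hand from figures 3.32 and 3.39 (one label per node).
ROCK = ["rock", "natural object", "whole", "object",
        "physical entity", "entity"]
CAT = ["cat", "feline", "carnivore", "placental", "mammal", "vertebrate",
       "chordate", "animal", "organism", "living thing", "object"]


def limit_analogy(levels, refuting_chain):
    """Mark each level Active until the first category shared with the refuting subject."""
    refuted = set(refuting_chain)
    active = True
    limited = []
    for category in levels:
        if category in refuted:
            active = False  # this level and every broader one become Inactive
        limited.append((category, active))
    return limited


def analogy_applies(limited_levels, subject_chain):
    """The analogy applies if the subject falls in at least one Active category."""
    members = set(subject_chain)
    return any(active and category in members
               for category, active in limited_levels)


limited = limit_analogy(ANALOGY_LEVELS, ROCK)
for category, active in limited:
    print(category, "Active" if active else "Inactive")
print(analogy_applies(limited, CAT), analogy_applies(limited, ROCK))
```

The whole analogy is kept and only its broader levels are switched off, so a cat still falls within the Active range through organism and living thing while a rock does not.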


Ideally, the upper limit of the analogy frame of figure 3.41 can be illustrated by figure 3.42, and can be determined by repeated patterns recognised in sentence discourse, which then establish an analogy abstraction. Given that subject X and subject Y are of Type I, any element E not shared by X or Y above I establishes the upper limit of the analogy below E.

=X> Socrates    =E1> Man      =I> Person
=Y> Cleopatra   =E2> Woman    =I> Person

Analogy Upper Limit

=I> Person
=O> Organism
=S>

Figure 3.42: The Analogy Upper Limit

Ideally, the lower limit of the analogy frame of figure 3.41 can be illustrated by figure 3.43, and can be determined by using the Wordnet hyponym hierarchies (HH) to compare a new subject Y with an established analogy abstraction. Given that subject X accepts element E at all levels, and subject Y refutes E at a common intersection I, the lower limit of the analogy is established above I, and the analogy remains valid in the section indicated by the pass.

HH1:
=X2> person
=X3> organism, being
=X4> living thing
=X5> I: Object
=X6> physical entity
=X7> entity
IS: Mortal

HH2:
=Y0> Rock
=Y1> I: Object
E: NOT: Mortal

Analogy Lower Limit

Figure 3.43: The Analogy Lower Limit

3.25 Truth and Weight in analogy

In order to determine an understanding of the discourse, a weight must be added to the analogy data. The weight assists in providing a determination through probability. The weight is increased as patterns are identified as being repeated. This confirms valid communication structures even if the meaning is unknown. Because a person has provided the sentence, it is assumed that there is valid logic behind the pattern. The probability then provides the context once a meaning can be extracted via experience, statistics, and the frame context signatures described in section 3.61. The truth weightings are attached as metadata to the frame as circumstances dictate.


Other factors that assist with the weighting of the frame are also included in the database tables; however, the formulas that utilise the statistics have not yet been designed. The following frame table information can be utilised in any script or algorithm. A more detailed explanation and use will be given in future work; however, the table information provides a framework for expansion of the capabilities of the parser.

Frame-Data: contains the Text Frame or Analogy Frame
Reference: allows a pointer memory address to be stored in the database
Category: is the bit defined category of the context or domain
Order: is the order of the frame in the linked list
Weight: is the weight attributed to the frame
Transform: is the level of category that the frame transforms to a new category
Threshold: is the limit to where the frame is active and relevant
Relationship: is a flexible attribute with no specific definition, used as required
Concepts: provides bit defined categories of the underlying and interrelated concepts in the concept database, which may fit to the frame
Concept-Mask: assists as a filter to incoming concepts to the frame
Tense: assists in identifying bit defined Tense categories, and the tense related concepts
Tense-Mask: assists as a filter to incoming tense concepts to the frame
Extensions: acts as a bit defined placeholder for expansion of the frame table information
Extension-Mask: assists as a filter for the bit defined extensions
Data-link: provides a link to external resources of data relevant to the frame
Reversion: provides tracking for earlier versions of the current frame that the current frame has been transformed from
Exclusions: provides a list of exceptions to the current frame being activated for a particular category
Links: allows the frame to be linked to another
Formula: is the formula or script attached to the frame
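The frame table fields listed above could be mirrored by a record type along the following lines. This is a hedged sketch only: the class name FrameRecord, every Python type, and the default values are assumptions, since the report does not give column types, and the bitwise-AND reading of the Concept-Mask filter is one possible interpretation rather than the documented behaviour.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FrameRecord:
    """Hypothetical in-memory mirror of one row of the frame table."""
    frame_data: str                      # Text Frame or Analogy Frame
    reference: Optional[int] = None      # stored pointer/memory address
    category: int = 0                    # bit defined context or domain category
    order: int = 0                       # position in the linked list
    weight: float = 0.0                  # weight attributed to the frame
    transform: int = 0                   # category level at which the frame transforms
    threshold: float = 0.0               # limit to where the frame is active
    relationship: str = ""               # flexible attribute, used as required
    concepts: int = 0                    # bit defined concept categories
    concept_mask: int = 0                # filter for incoming concepts
    tense: int = 0                       # bit defined tense categories
    tense_mask: int = 0                  # filter for incoming tense concepts
    extensions: int = 0                  # bit defined placeholder for expansion
    extension_mask: int = 0              # filter for the extensions
    data_link: str = ""                  # link to external data resources
    reversion: List[str] = field(default_factory=list)   # earlier frame versions
    exclusions: List[str] = field(default_factory=list)  # categories that block activation
    links: List[int] = field(default_factory=list)       # links to other frames
    formula: str = ""                    # formula or script attached to the frame

    def filter_concepts(self, incoming: int) -> int:
        """Assumed mask semantics: keep only the incoming bits the mask allows."""
        return incoming & self.concept_mask


frame = FrameRecord(
    frame_data="* who ingest poison will die",
    concept_mask=0b0110,
    exclusions=["object"],
)
print(bin(frame.filter_concepts(0b1100)))
```

Keeping the bit defined fields as plain integers makes the mask fields cheap to apply, at the cost of needing a separate table that names each bit.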


Section 4: Experimentation, Results and Analysis

As Hercules is in the prototype stages, most experiments, analysis and results will be carried out after future work. Hercules has been designed to run experiments on patterns of discourse and to store that data in a database. The results of the experiments will then supplement a context for the discourse, so that it can be understood in the context provided by the user. It is anticipated that neural-network-style learning using the scores, signatures, concept fragments and wave analysis will supplement the understanding. Because of the breadth and flexibility Hercules can provide, it would be outside the scope of this report to examine specific formulae and the results that may be found, except to say that Hercules is capable of having user-defined functions and symbols mapped to software #defined values for flexibility in executing hard-coded methods to support the experiments envisaged.


Section 5: Future Work

5.1 Algorithm Design

As described earlier in section 3.61 (Frame Queries and Statistics), the approach of A hybrid Approach to Word Sense Disambiguation: Neural Clustering with Class Labelling (section 2.6) may be used in conjunction with A Generative Model for Semantic Role Labelling (section 2.7). Role labelling would then be supplemented by the speech patterns identified by the frame queries and statistics of section 3.61. This supports the abstraction of concepts (section 3.9) and assists in the understanding of shared communications and goals (section 3.10).

5.2 Hercules Parser Enhancements

Hercules must be augmented to handle many of the logic functions and elements used in Set Theory and Predicate Logic. This will allow formal logic algorithms to be used alongside mathematical equations on subsets of data, in order to control the interpretations and responses of the Hercules Parser. The graphical user interface must be enhanced to streamline manual enhancements. The uses of Frame and Table metadata must be further defined and applied alongside the script functions and operations, which will assist in reaching the goal of true artificial intelligence.

5.3 Reaching the goal of True Artificial Intelligence

Ultimately, the Hercules Parser will likely construct its own scripts automatically to deal with new communications and concepts. The accuracy of the hypotheses that Hercules forms and replies to, with regard to those communications, may be the true test of the intelligence of the machine.


Section 6: Concluding Remarks

The Hercules parser has been created with so much flexibility in mind that it is difficult to discuss the entire program and every function it is capable of. I have aimed this exposition at the most interesting and important components of the parser as used with Wordnet. The Hercules parser can provide a simple way to design, test, and implement Artificial Intelligence algorithms, formulae and scripts through a graphical interface, without requiring users to have extensive knowledge of computer programming.
