DEFINITIONS
Enrico Sciubba
University of Rome "La Sapienza," Italy
Keywords: Artificial Intelligence; Expert Systems; Knowledge-Based Systems
Contents
1. Introduction: Engineering Design, Knowledge, and Artificial Intelligence
2. What is Artificial Intelligence?
3. Definitions of Concepts and Terms
4. Relational Versus Deterministic Programming
5. Possible Versus Existing Applications of AI to Thermal Systems
6. Logical Systems
7. Semantic Networks
8. Fuzzy Sets
9. Neural Networks
10. Casual Versus Mechanical Learning: "Memory"
11. Search Methods
12. Handling of Constraints
13. Qualitative and Approximate Reasoning: Belief
Related Chapters
Glossary
Bibliography
Biographical Sketch
Summary
This article is an introduction to the field of artificial intelligence (AI). After a brief discussion
of the characteristics that make AI useful for engineering applications, a concise definition of
terms and concepts is given. The presentation style has been tailored to provide readers with a
general introduction to AI topics, without burdening them with excessive formalism. Since the
goal of this topic is to describe engineering applications to thermal design, emphasis on the
applicative side has been stressed in this introductory article.
1. Introduction: Engineering Design, Knowledge, and Artificial Intelligence
There is a consensus today that engineering design is a highly structured, interdisciplinary,
creative process, and it is invariably the result of a close cooperation between several
individuals (who formally or informally constitute the design team). Each person's diverse
expertise and specialized skills must be blended together, and result in an engineering project
(consisting of drawings, calculations, and all necessary technical documents) about a new
solution to the problem that prompted the team into action. The product of a design activity is
always an original answer to a technical, economic, social, or marketing problem. Naturally,
past experience (both general and specific) plays an important role in the formulation of the
solution, and this explains the intrinsic similarities between successive generations of gas
turbine plants, diesel engines, and similar machinery.
Therefore, engineers should consider that a design activity, besides its quantitative side, has
an equally important qualitative side. In addition to performing engineering calculations, it is
necessary to be able to communicate to third parties not only the results of these calculations,
but also the original problem definition and the method that was used to reach a solution.
Finally, to increase the chance of future design improvements, it is more important to leave
written records of why and how the design goal was achieved starting from certain premises,
than to give an exact description of the numerical calculations which led to the definition of
certain design parameters.
In this topic, we will show how to employ artificial intelligence techniques for this purpose.
Moreover, we will describe how these techniques can be used to express in qualitative terms
the logical steps of an engineering activity, to formalize and systematize a large body of
knowledge, and to assist engineers by offering them logical guidance and support in their
design endeavors.
2. What is Artificial Intelligence?
Artificial Intelligence (for a more precise definition see Section 3) is a cumulative
denomination for a large body of techniques that have two general common traits: they are
computer methods, and they try to reproduce non-quantitative human thought processes. For
design applications, there is a third common trait: problems handled by AI-based techniques
are usually ill structured; that is, they are difficult or impossible to tackle with pre-determined
solution models.
The applications we will deal with in this article are a relatively small subset of general AI
techniques: knowledge based systems, also called expert systems. An expert system (ES) can
be roughly described as a specialized AI application aimed at the resolution of a single and
well-identified class of problems. Some of the major benefits offered by ESs are as follows:
They can reduce design time and effort: once an ES has
been built and fielded by a company to assist its design engineers in the
choice, design, and technical specifications of a certain component, all
future designs will be carried out in a much shorter time, and with a less
intensive use of both high-level human resources and computational
hardware.
They can make knowledge more widely available, and help overcome
shortages of expertise. So, if an ES is developed by the research and
development division of a company to overview the adaptive control of a
line of production, it can be applied in all of that company's factories (as
long as the same processes remain in use) without needing to dedicate
highly qualified human resources to the proper communication of the
technology.
Knowledge stored in an ES is not lost when experts are no longer
available. This is, for instance, the case of a numerical process simulator
that is developed and implemented in a machine language that then
becomes obsolete. If a language-independent ES has been employed in the
development of the simulator, then far fewer high-level human resources
need to be employed to re-implement the code in a different language or
under a different operating system.
An ES stores knowledge it can gather from experts in some field, even if
they are not computer specialists (and even if they are not computer
users!). For example, when prospecting for new raw resources, it is often
indispensable to gather non-specific knowledge about the nature of the
area to be scouted, or about the type and history of possible ores or flows
that may have surfaced in the past. When performing this "information
gathering" in remote areas, an ES assistant that prompts, collects, and
critically analyzes answers by natives can drastically reduce both the time
needed to make an educated decision, and the amount of resources
invested in the prospecting activities.
An ES can perform tasks that would be very difficult to perform with more
conventional software: thus, it becomes possible to implement features
which were impossible or very cumbersome with earlier systems. A case in
point is the handling of approximate knowledge that cannot be hard-coded
otherwise. For example, while designing a heat exchangers network (see
below, and article AI in Process Design), it is at times convenient to relax
the rule "no heat flux across the pinch," and reformulate it as "some small
measure of heat flux across the pinch is allowed, provided the hot or cold
utility is conveniently reduced by this measure." This construct would be
impossible to translate into any rigidly framed type of computer
instruction, but it can be easily managed by an ES that treats the word-concepts "some" and
"conveniently" in a relative way, just as we mean them when we verbally formulate the
relaxed rule.
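As a minimal sketch of this idea (the function name, thresholds, and units are invented for illustration, not taken from any actual pinch-analysis code), an ES could encode the relaxed rule by treating "some" and "conveniently" as adjustable parameters rather than hard-coded limits:

```python
# Hypothetical encoding of the relaxed pinch rule. The parameters `some`
# (a small allowable flux, in kW) and `conveniently` (a minimum ratio of
# utility saving to cross-pinch flux) stand in for the vague word-concepts.

def relaxed_pinch_rule(cross_pinch_flux_kw, utility_saving_kw,
                       some=50.0, conveniently=1.0):
    """Allow a small heat flux across the pinch, provided the hot or
    cold utility is reduced by at least `conveniently` times that flux."""
    if cross_pinch_flux_kw == 0:
        return True                        # the strict rule is satisfied
    is_small = cross_pinch_flux_kw <= some
    is_convenient = utility_saving_kw >= conveniently * cross_pinch_flux_kw
    return is_small and is_convenient

print(relaxed_pinch_rule(30.0, 45.0))    # small flux, adequate saving -> True
print(relaxed_pinch_rule(80.0, 200.0))   # flux too large -> False
```

Tuning `some` and `conveniently` per domain is precisely the "relative" treatment the text describes: the same rule text covers many concrete situations.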
Other practical advantages associated with the development of a knowledge-based system are:
rapid prototyping
explicit knowledge encoding
possibility of dynamic and efficient further developments
ease of alteration
explanations capabilities for development and validation
All of these benefits come at a price: high requirements in computer resources, and low speed
of computation. The amount of physical memory space required by a knowledge base is very
high, because qualitative knowledge is not easily amenable to a simple numeric (binary)
representation. Moreover, since qualitative rules are implemented in machine language by
complex combinations of elementary logical operations, and the number of operations per
clock tick remains constant for a given hardware, the actual execution time of a rule is much
higher than that of a numerical operation. Furthermore, developing and installing a large ES is
a high added-value operation, and therefore a very expensive enterprise: the projected
returns should always be carefully scrutinized before launching an ES project. Two main
factors that can make knowledge based systems profitable are:
a. the specialized work load could be (or become) higher than that which
can be reasonably performed by the available experts, or some of these
experts might not be available in the foreseeable future; and
b. the existing (or foreseen) working conditions could lead to too high a
demand on decision-making engineers; this, in turn, may result in what
goes under the name of cognitive overloading. When cognitive overload
sets in, increasing the number of human experts dedicated to the problem
is not a proper solution, because communication among experts may
deteriorate (too many information bits of very different kinds must be
communicated to an increasing number of people).
Before proceeding any further, it is useful to present here our rebuttal of some of the more
common misconceptions about AI in general and ESs in particular.
a. It is not true that ESs are suitable for every type of design problem. For
instance, a very large class of structural, thermal, and fluid dynamic
problems can be tackled very effectively with "exact" (and fully
deterministic) algorithmic techniques, and rather poorly with an ES. On
the other hand, it is not correct to describe an ES as "an alternative to
conventional computer programming": in the majority of practical design
cases, the fields of application display a substantial overlap.
b. It is not true that AI users must be conversant with AI programming
languages. Neither is it true that creators of ESs must be familiar with
high-level languages like PROLOG, LISP, and the like: actually, often all
it takes to develop an effective ES is a sound knowledge of the principles
of predicate logic, complemented by specific domain knowledge.
c. However sophisticated and knowledgeable in their field of expertise, ES
users are logically, functionally, and mentally distinct from knowledge
engineers. Similarly, a domain expert is a substantially different
professional figure from a knowledge engineer (see Sections 3.15 and 3.16
below).
d. In turn, not every design or process engineer, or technical operator, is a
domain expert: in reality, there is a remarkable scarcity of reliable and
actually knowledgeable domain experts.
e. On the other hand, an ES should not be tested "against the domain
expert." That is, its performance ought not be compared with that of the
human expert in a similar situation.
In common language, we attribute the quality of "intelligence" both to individuals who must
think because they do not know much, and to those who know much, and therefore can afford
to think less. In the terms of AI, this can be rephrased as follows: we tend to attach the
attribute "intelligent" both to individuals who possess a small amount of domain knowledge,
and must therefore exercise a large quantity of inference to draw their conclusions, and to
individuals who possess a large amount of domain knowledge, and can therefore reach the
same conclusions by performing a rather limited inferential activity. Notice that in both
definitions we have implicitly assumed that an intelligent reaction is neither spontaneous nor
necessarily dictated by the available data. This point is important for two reasons. First, it
makes a clear distinction between potential intelligence, intended as a "memorization of
collected knowledge," and dynamic intelligence, which is at once the act and the result of
"reflecting" upon the collected knowledge. Second, it reminds us that the concept of
"thinking" reported in this topic refers to human thinking modes somehow hard- or soft-coded
into a machine. A one-year-old human (or, for that matter, a one-month-old cat!) can "think"
("infer") and "learn" ("increase the amount of knowledge it has command of") much faster
and more effectively than today's most advanced AI programs. On the other hand, AI methods
produce codes that perform, with almost incredible reliability and speed, cognitive
(qualitative) tasks that mimic very closely some traits of human thinking patterns. If we agree
with the idea that potential and dynamic intelligence are operationally equivalent (in the sense
that they can reach the same results), then clearly there is no limit to the "IQ" of an AI code:
to increase the IQ just requires storing more (properly constructed and connected) knowledge
in its database. In this sense, computers are intelligent, even by today's standards.
3. Definitions of Concepts and Terms
3.1 Artificial Intelligence (AI)
There is no universally accepted definition of AI. We shall adopt the following one:
AI is that part of computer science that investigates symbolic, non-algorithmic reasoning
processes, and the representation of symbolic knowledge for use in machine inference.
This is a modern and complete definition, which contains more than one point of interest for
following this topic. Firstly, it stresses the symbolic character of AI's object of study. Though
its meaning is obvious, its implications are not: AI's field of interest does not contain numbers
and operations on them, but logical objects and their mutual relationships. Secondly, it
introduces the concept of non-algorithmic reasoning, which is usually very difficult to grasp
for an engineer: a solution to a problem is not obtained by a predefined series of operations,
but rather by a series of logical inferences, consisting of an a priori unknown number of steps,
and whose outcome is in no way predictable in its entirety. Thirdly, it correctly states that the
scope of AI is to generate a process of machine-implemented inference (that is, to construct a
computer code which can infer a cause from an effect, possibly with the help of some
additional (logical) constraints).
Historically, the first reference to the concept of artificial intelligence is generally credited to
McCarthy, but his original definition was too broad for his times, and gave rise to a number
of misinterpretations and misconceptions which still burden this branch of computer science:
AI is that compendium of computer techniques dedicated to the implementation of a
simulating procedure for any mental process.
Today, this definitionambitious as it may seemis being regarded with growing attention
as our studies in the cognitive sciences (psychology and bioneurology) gain in depth and
breadth. While many of the current schemes to explain mental artifacts like memory, selective
memory, learning, and understanding are only crude approximations or primitive
representations, we are now capable of constructing an approximate model of the complex
chains of biochemical phenomena taking place in the brain. For the time being, though, the
idea of "simulating any mental process" has been replaced by the following assumptions:
a. "intelligence" can be explained and represented as a symbol-manipulating activity;
b. this activity can be embodied in a physical symbol system; in particular, it can be both
described and implemented in such a system;
c. the symbol manipulation necessary to the description and to the implementation of any
"intelligent activity" can be carried out on digital computers;
d. various aspects of human "intelligence" (in particular, what is usually called "logical
reasoning" in common language) can be modeled by such physical symbol systems;
and
e. there may be a "general theory of intelligence" that may originate a "general symbol
system," which in turn might result in a "universal computer code" capable of
describing all "intelligent phenomena": however, even if its discovery should be a long
range goal of AI research, it should not be sought to the detriment of "partial" or
logically "local" symbol systems (which result in "specific" applications).
(This seemingly "technical" dispute (about the possibility of simulating "thought") has
become a philosophical issue between those who deny the possibility of "strong AI" and
underline the uniqueness of human mental activity, and those who advocate a possible
future implementation of a "universal virtual simulator," indistinguishable from a
human brain. The topic obviously exceeds the limits of this essay.)
Recently, another purpose for AI has been put forth: as noted by D.A. Mechner, in spite of
some humbling failures that helped bring things back into a more down-to-earth perspective,
AI methods and devices can "serve as windows on the mind," (that is, help develop a
scientifically pragmatic understanding of the very concept of "intelligence").
3.2 Knowledge
To be able to handle knowledge (that is, to make use of it), one has to make a distinction
between two types of knowledge:
(a) Declarative knowledge, that is, knowledge of facts and relationships. Also called knowing
what. Examples:
i. A turbine is a rotating, bladed wheel that absorbs energy by interacting with a fluid.
ii. A desalination process is a set of components that operate on an input stream of
seawater and, by making use of some external energy source, separate a large portion
of the dissolved salts in the water.
iii. Burning coals with high ash and sulfur content induces more fouling in the boiler
tubes, and produces acid stack gases.
(b) Procedural knowledge, that is, knowledge of the cognitive steps to take to draw certain
conclusions from a data set. Also called knowing how. Examples:
i. to establish that a machine is a turbine, check that it is rotating, that it has blades,
and that a fluid expands in it;
ii. provide the desalination plant with some sort of external energy input (heat or work);
iii. if fouling in the tubes of a certain coal-fired boiler is higher than usual, check the
history of the quality of the coals that have been burned in it; and
iv. if stack gases are high in SO2, either de-sulfurize the gases in a De-SOx unit, or
de-sulfurize the coal before it is fed to the boiler.
In conclusion, it can be seen that under the name of "knowledge" we refer to the complex of
information (relational, numerical, topological, etc.) that constitutes the database in which an
AI procedure searches for solutions. What can be included in this "knowledge" will be made
clearer later in this essay: the list above implies that it can include both data and procedures
(even calculations!), and both deterministic and probabilistic links.
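The distinction above can be made concrete in a short sketch (the facts, thresholds, and function names below are invented for illustration): declarative knowledge naturally maps onto data structures, procedural knowledge onto procedures that operate on them.

```python
# Declarative knowledge ("knowing what"): facts and relationships,
# stored as plain data.
declarative = {
    "turbine": "rotating, bladed wheel that absorbs energy from a fluid",
    ("high_ash_sulfur_coal", "causes"): ["boiler fouling", "acid stack gases"],
}

# Procedural knowledge ("knowing how"): the cognitive steps to draw a
# conclusion from a data set, here the boiler-fouling example (iii).
def diagnose_fouling(fouling_above_normal, coal_history):
    """If fouling is higher than usual, check the coal quality history.
    The 15% ash / 3% sulfur thresholds are hypothetical."""
    if not fouling_above_normal:
        return "no action"
    bad = [c for c in coal_history
           if c["ash_pct"] > 15 or c["sulfur_pct"] > 3]
    return f"suspect coals: {len(bad)}" if bad else "check other causes"

history = [{"ash_pct": 20, "sulfur_pct": 1},
           {"ash_pct": 8, "sulfur_pct": 1}]
print(diagnose_fouling(True, history))   # -> suspect coals: 1
```

An AI procedure searches a database that, as the text notes, may mix both kinds: the `declarative` dictionary and the `diagnose_fouling` procedure could live side by side in the same knowledge base.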
As a final remark, it should be always kept in mind that knowledge is by its own nature
hierarchical. Consider the graph in Figure 1, which depicts our path towards "understanding"
a problem (in this case, the fact that burning coal produces heat).
We can filter through these facts using our common technical sense, and, by the rules of a
domain specific syntax, obtain a set of "catalogued" data. These data do not necessarily
possess any a priori logical structure, but rather have been collected according to some
reproducible criterion; for instance, classifying the coal by its type and chemical composition,
stating the necessary conditions for its ignition, those for a steady combustion, those for
collecting the heat flux, and so on.
a. Operating on these data with proper semantics, we collect a number of information
bits, which can be used to build some domain-related knowledge. To do this, we
"systematize" the collected knowledge using some domain-specific logic, which can
be called "systemic" or "knowledge about the system" (to continue with the jargon of
our example, combustion process engineering). Notice that the semantics employed
here does not need to be domain-specific (that is, related to the coal), but it is usually
at least field-specific (in this case, related to thermal process engineering).
b. Finally, this first-degree knowledge can be reevaluated, reorganized, and systematized
by examining it in a larger context (that of thermal power plants): we reach a
metaknowledge about our domain, and we, for instance, "understand" that coal can be
replaced by other fuels of different physical and chemical structure.
c. The specific knowledge gathering is thus completed: but we could (if the assigned
design task demands it) operate a further abstraction; that is, put the problem in an
even larger context (cogeneration of heat and power), and so on.
Any automatic knowledge-handling paradigm devised to model ("reproduce") the entire
knowledge build-up procedure must therefore be hierarchical in its structure.
3.3 Expert System (ES)
An ES is an application (that is, a series of procedures aimed at the solution of a particular
class of problems) built around a direct representation of the "general expert knowledge,"
which can be collected (and not necessarily systematized in any way) about those problems.
In general, an ES associates a rule (see below, Section 3.6) with every "suggestion" or piece
of "advice" a specialized expert would give on how to solve the problem under consideration.
The name expert system is inaccurate and misleading, because it carries a sort of mystical
flavor: the systems we are going to deal with are better described as knowledge based
systems (KBS).
The KBSs (or ESs) we describe in this topic do not use the same thought processes as
humans: they do not think, nor do they really mimic human expertise. They simply store, in a
special type of database, a rather limited amount of knowledge about some specific features of
the real world, which is put into them by human "experts," in such a way that it can be
formally manipulated, used to infer results and, in general, represented and applied in an
orderly fashion when and as needed. An ES is as trustworthy as the information supplied to it
by the expert(s). One of its inherent advantages, though, is that it can be interrogated about its
reasoning, so that false and wrong knowledge can be easily detected, and cannot be concealed
in obscure computer code. From the structural point of view, a KBS has a software
architecture in which both data and knowledge are separated from the computer program that
manipulates them (which is called the inference engine).
3.4 Knowledge Base (KB)
A knowledge base is the model of the portion of the Universe addressed by an ES. It may
contain a description of all elements of the classes under consideration, a list of their mutual
relations, or a list of rules (including mathematical formulae) that tell us how to operate on
these elements. It can contain procedures that point outside of itself, and create dynamic links
to other procedures or databases.
3.5 Inference Engine (IE)
The inference engine (IE) is the portion of the ES in which the rules are employed to enact the
"reasoning": so, an IE consists of the software which uses the data represented in the KB, in
order to reach a conclusion in particular cases. In general, it describes the strategy to be used
in solving a specific problem, and it acts as a sort of shell that guides the ES from query to
solution. The notion is very general, and it is not possible to provide a universal definition of
the structure of an IE. Some procedures have one single IE, but the present tendency towards
modularization suggests developing separate IEs for separate portions of the knowledge-handling paradigm. The idea that "the IE is where the rules are coded" captures the essence of
the matter well: in practice, there are technical and conceptual complications that make this
simplification impractical for theoretical purposes. The topic is discussed in more detail in
article Expert Systems and Knowledge Acquisition.
3.6 Rules
A rule is a way of formalizing declarative knowledge: it consists of a left hand side (LHS) and
a right hand side (RHS), connected by certain sequences of logical symbols that make the
sequence "LHS-connectives-RHS" equivalent to an "IF (LHS) THEN (RHS)" logical chain.
The LHS is composed of a series of propositions to be regarded as "conditions," joined by
AND or OR logical connectives; the RHS is a proposition which corresponds to an "action."
The general meaning of a rule is: "RHS is TRUE IF either all of the AND-joined conditions,
or at least one of the OR-joined conditions, in the LHS assume a TRUE value." For example:
"IF a machine is rotating AND IF it has blades AND IF a fluid expands in it, THEN the
machine is a turbine."
Notice that a rule is a statement of relationship(s), and not a series of instructions: the
IF...THEN construct discussed here is entirely different from the FORTRAN
IF...THEN...ELSE construct. The latter is a procedural conditional instruction: if A is true,
then do B, else do C; a rule is a logical relational statement between propositions: if p is true
and r is true ... and s is true, then z is true. As a consequence of this definition, it is also clear
that a rule can be viewed as a logical operator acting on a premise p at a certain knowledge
level, and projecting the conclusion z onto a higher level: z becomes a meta-representation
of the set {p, r, s, ...}. The reverse path is not unique: {p, r, s, ...} may be logical antecedents
of z, but also of a different conclusion w.
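A minimal sketch makes the relational (as opposed to procedural) character of rules concrete. The representation below is invented for illustration: each rule is data (a set of AND-joined LHS propositions and one RHS proposition), and a tiny forward-chaining loop, not the rule itself, does the work:

```python
# Rules as data: (AND-joined LHS conditions, RHS conclusion).
# OR-joined conditions can be expressed as several rules sharing one RHS.
RULES = [
    ({"is_rotating", "has_blades", "fluid_expands_in_it"}, "is_turbine"),
]

def infer(facts, rules=RULES):
    """Apply every rule whose LHS is satisfied, repeating until no new
    conclusion appears (a naive forward-chaining inference step)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs <= facts and rhs not in facts:
                facts.add(rhs)
                changed = True
    return facts

print("is_turbine" in infer({"is_rotating", "has_blades",
                             "fluid_expands_in_it"}))   # -> True
print("is_turbine" in infer({"is_rotating"}))           # -> False
```

Note that nothing here resembles a procedural `IF...THEN...ELSE`: the rule merely states a relationship between propositions, and the engine decides when (and whether) to use it.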
3.7 Facts
Facts are specific expressions that describe specific situations. These expressions can be
numerical, logical, symbolic, or probabilistic. Examples are:
height = 130 m
IF (outlet pressure) IS LESS THAN (inlet pressure)
3.8 Objects
An object is a computer structure that can contain data, data structures, and related
procedures. It is best described as "a package of information and the description of its
manipulation." Another definition is that an object is an entity that combines the properties of
procedures and data, since it performs computations and saves local state. For example:
"Francis hydraulic turbine with H = 300 m, Q = 20 m3/s, whose performance is described in
subroutine FRANCIS." Notice that the first portion of this object is a declarative description,
while the second one is a procedure (which could be, for instance, written in a scientific
language like FORTRAN or C).
The traditional view of software systems is that they are composed of a collection of "data"
(representing some information), and a set of "procedures" (to manipulate the data). The code
invokes a procedure, and provides it with some data to operate on. Thus, data and procedures
are treated as completely distinct entities, while in reality they are not distinct: every
procedure must make some assumption about the form of the data it manipulates. An object
represents both entities: it can be manipulated, like data, but it can describe a procedure as
well, and manipulate other objects (or itself).
Objects communicate through messages, which are the manipulating instructions sent by one
object to the one(s) it wants to modify. A message contains the so-called "selector," which is
the "name" of the desired manipulation: no details of the related procedure are sent with the
message. This is an advantage, since procedures may be "local" to certain objects, and usually
differ depending on the data they are applied to.
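The Francis-turbine object of Section 3.8 can be sketched as follows (the class is illustrative: the efficiency value and the power formula are placeholders, not the FRANCIS subroutine mentioned in the text):

```python
class FrancisTurbine:
    """Packages declarative data (H, Q) with its local procedure."""

    def __init__(self, head_m, flow_m3s):
        self.head_m = head_m          # declarative part: data attributes
        self.flow_m3s = flow_m3s

    def performance(self):            # procedural part: local behavior
        # rho*g*H*Q*eta, with an assumed efficiency of 0.92; result in MW.
        rho, g, eta = 1000.0, 9.81, 0.92
        return rho * g * self.head_m * self.flow_m3s * eta / 1e6

t = FrancisTurbine(head_m=300, flow_m3s=20)
# Sending a "message": only the selector name `performance` travels;
# the procedure body stays local to the object.
print(round(t.performance(), 1))      # -> 54.2 (MW, with assumed numbers)
```

This is exactly the advantage noted above: the caller needs the selector, not the procedure, so different turbine objects may implement `performance` differently behind the same message.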
3.9 Classes
Classes are a description of one or more similar objects: for example, "hydraulic turbines."
Notice that in this case, the object described in Section 3.8 above would be an instance of the
class introduced here. Since Pelton and Kaplan turbines also belong to this class, the
corresponding objects will bear a significant degree of similarity with the "Francis" object,
but will perforce have some different characteristic. This difference could be in the
description (for example, "Pelton hydraulic turbine"), and/or in the data attributes (for
example, "H = 1000 m and Q = 5 m3/s"), and/or in the procedural part (for example, "whose
performance is described in subroutine PELTON"). The definition of class is recursive: there
may well be sub-classes (for example "Francis", "Pelton", and "Kaplan"), each one with
sub-sub-classes (for example "Francis with high specific speed"), and so on. The limit to the
subdivision is set by common sense: the closer the construct is to a real world entity, the
easier will be the translation of the "real world problem" into a computer model.
The members of a class (or of a sub-class: the property is recursive by its own definition) are
said to inherit all of the properties common to and typical of the class they belong to.
Naturally, they will have some distinguishing feature(s) that makes them different from other
members of the same class, but they participate in the inference process with the properties
they share with their fellow members. For example, sub-classes "Kaplan", "Francis", and
"Pelton", as well as "Steam Turbine" inherit the property "rotating wheel with blades" from
their common ancestor-class "Turbine".
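The turbine hierarchy and its inheritance of "rotating wheel with blades" can be sketched directly (class names follow the text; the attributes are illustrative):

```python
class Turbine:                    # common ancestor-class
    rotating = True
    has_blades = True

class HydraulicTurbine(Turbine):  # sub-class
    working_fluid = "water"

class Francis(HydraulicTurbine):  # sub-sub-classes, recursively
    pass

class Pelton(HydraulicTurbine):
    pass

class SteamTurbine(Turbine):
    working_fluid = "steam"

# Every member participates in inference with the inherited properties:
print(Pelton.has_blades, SteamTurbine.rotating)   # -> True True
print(Francis.working_fluid)                      # -> water
```

The distinguishing features (here `working_fluid`) live at the level where they first differ, while shared properties are stated once in the ancestor, which is what makes the recursive subdivision manageable.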
3.10 Induction
Sometimes erroneously called abduction, induction is a logic procedure that proceeds from
effects backwards to their causes. Induction is the method first proposed and formalized in
1620 by Sir Francis Bacon in his Novum Organum, and originally it meant the activity of
deriving a scientific theory from a set of well-organized empirical data and some general rules
of "abstraction." In spite of some literary and theoretical examples to the contrary, induction is
far from being infallible. It must be used with care, or together with some method for
scanning all of the actually known (or foreseeable) causes an event may have, and for
properly choosing between them, possibly by probability methods.
For example, high-frequency vibrations detected on a gas turbine shaft (effect) may be caused
by compressor stall, rotor unbalance in the compressor or in the turbine, fluid dynamic
instabilities in the flow into the turbine, irregular combustion, mechanical misalignment of the
shaft, failure in the lubrication system, bearing wear, or a crack in the gas turbine casing.
Pure induction will not produce a solution in this case, because it cannot decide which one is
the "exact" cause. Thus, additional data, in the form of secondary effects generated by each
one of the causes, must be available. These additional data can be included in the inductive
process to eliminate some of the causes from the list, by matching the effects they would
induce with the actually observed effects.
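The screening step just described can be sketched as follows. The cause-to-effects table below is invented for illustration (a real diagnostic KB would be far richer); the point is only the mechanism of matching predicted secondary effects against observed ones:

```python
# Hypothetical secondary effects each candidate cause would produce,
# in addition to the primary effect (high-frequency shaft vibration).
CAUSES = {
    "compressor stall":    {"hf_vibration", "pressure_fluctuation"},
    "rotor unbalance":     {"hf_vibration", "sync_vibration_component"},
    "lubrication failure": {"hf_vibration", "high_oil_temperature"},
}

def induce(observed_effects):
    """Keep only causes whose predicted effects are all observed."""
    return [cause for cause, effects in CAUSES.items()
            if effects <= observed_effects]

print(induce({"hf_vibration", "high_oil_temperature"}))
# -> ['lubrication failure']
print(induce({"hf_vibration"}))   # pure induction cannot decide -> []
```

The second call illustrates the limitation stated in the text: with the primary effect alone, every candidate remains possible under some unseen data, so this strict matcher returns nothing and additional observations (or probability methods) are needed.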
3.11 Deduction
Deduction is a logic procedure that proceeds from causes forward to their effects. Deduction
is much more reliable than induction (of which it is the logical opposite), but also much more
deterministically limited. For example, compressor stall can cause high-frequency vibrations
on a gas turbine shaft; but if the only cause for gas turbine shaft vibration we have in our
knowledge base is "failure of lubrication system," we will deduce (with a degree of certainty
of 100%!) the wrong conclusion. So, pure deduction must be endowed with "double
checking" procedures, which allow the possibility of different, unexpected causes to be
present. In the case of the example, an oil temperature and pressure check could exclude the
lubrication failure as a cause, and prompt for further analysis. With a perfectly symmetric
procedure to that suggested above for induction, deductive inference must thus be
supplemented by some form of empirical control of the cause-effect database.
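The "double checking" of a deduced cause-effect link can be sketched like this (the sensor check and its readings are invented; the structure mirrors the lubrication example in the text):

```python
# A deliberately impoverished KB: the only known cause of shaft
# vibration is a lubrication failure, as in the text's cautionary example.
EFFECTS = {"lubrication failure": "shaft vibration"}

def deduce(cause, sensor_check):
    """Forward inference plus an empirical control of the link."""
    if cause not in EFFECTS:
        return None
    if not sensor_check(cause):       # double check before concluding
        return "cause not confirmed: analyze further"
    return EFFECTS[cause]

def oil_check(cause):
    # Hypothetical control: oil temperature and pressure in the normal
    # range would exclude a lubrication failure.
    oil_temp_ok, oil_press_ok = True, True
    return not (oil_temp_ok and oil_press_ok)

print(deduce("lubrication failure", oil_check))
# -> cause not confirmed: analyze further
```

Without the `sensor_check` guard, this code would conclude "shaft vibration is explained" with 100% certainty whenever lubrication failure is asserted, which is exactly the failure mode the text warns against.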
available. Figure 2 depicts the operational structure of a blackboard system. Notice that each
"agent" is an inference engine, that there may be different knowledge bases (sources), and that
a separate IE handles the sources-to-blackboard transfer of information.
tree does not guarantee that a solution exists, nor that, if it does exist, it can be found. Notice
that the nodes need not represent homogeneous activities: in Figure 3, the nodes numbered
from 1 to 19 may denote subprocesses, while the nodes indicated by capital letters (A through
G) signal main processes. Notice further that some of the branches are connected to each
other by a fictitious line that clearly represents no physical process. This is a convention used
to signify that the nodes from which the branches originate are AND-nodes: so, the realization
of main process B requires both subprocesses 6 and 7; process C requires either 7 or (3 and
10).
A node is said to be an AND-node if all the branches originating from it are AND-connected;
conversely, it is said to be an OR-node if all the branches originating from it are mutually
exclusive. In general, like in Figure 4, most real processes are AND/OR, that is, they are
connected to branches of both modes (they are said to be hybrid: in the tree depicted in the
figure, C and 9 are AND/OR nodes). Since it is logically convenient to reduce a tree to a set
containing no hybrid nodes, but rather AND and OR nodes only, techniques for eliminating
these mixed-mode nodes have been devised, and one is explained in a specific example
discussed in article Expert Systems and Knowledge Acquisition.
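A minimal sketch of AND/OR tree evaluation, loosely patterned on Figure 3 (the node names and the sets of available subprocesses are invented for illustration; the auxiliary node "C1" shows one way of replacing a hybrid branch with pure AND and OR nodes):

```python
# Each non-leaf node maps to ("AND", children) or ("OR", children);
# leaves (numbered subprocesses) do not appear in the table.
TREE = {
    "B": ("AND", [6, 7]),     # B requires both 6 and 7
    "C": ("OR", [7, "C1"]),   # C requires either 7 or (3 AND 10)
    "C1": ("AND", [3, 10]),   # auxiliary pure-AND node replacing the
                              # hybrid branch (3 and 10)
}

def realizable(node, available):
    """True if `node` can be realized from the available leaf subprocesses."""
    if node not in TREE:                      # leaf subprocess
        return node in available
    mode, children = TREE[node]
    results = (realizable(c, available) for c in children)
    return all(results) if mode == "AND" else any(results)

print(realizable("B", {6, 7}))    # True: both required subprocesses present
print(realizable("C", {3, 10}))   # True: via the (3 AND 10) branch
print(realizable("C", {3}))       # False
```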
rather, this solution (if it exists) must be sought ("produced") based on the knowledge of the
domain in which a problem is posed (its context). In other words, the model itself must be
constructed, heuristically, logically, or symbolically, by the solution paradigm. Numerical
methods result therefore in a system of equations whose form implicitly represents the general
features of the solution (features which can in fact be estimated by studying the "asymptotic"
behavior of the given equations); knowledge-based methods result instead in a formulation
that describes the context in which a possible solution arises. Also, numerical methods require
numerical techniques, which solve the governing equations with a degree of accuracy
established by the user, and in principle reducible at will. Knowledge-based methods, by
contrast, make use of the rules of predicate calculus to infer the solution (deductively,
inductively, or probabilistically) from the available information, and the problem position
consists of both the information and the rules.
A first difficulty is thus to define a representation of the "problem context" that also allows the
representation of inference rules: each rule must be defined by necessity as a (logical)
function of the knowledge it represents, to which it can be applied, and which it, in some way,
manipulates. Consequently, and only in this sense, the problem takes on a double valence: the
procedure and the data on which it is (or may be) called to operate must be represented with
the same set of symbols.
5. Possible Versus Existing Applications of AI to Thermal Systems
An AI procedure is capable of reproducing a series of actions that would be taken by a human
expert (or team of experts) in response to a certain stimulus. In the field of design and
operation of thermal plants, most of the actions taken by the engineer or by the plant operator
are amenable to codification: indeed, design textbooks and manuals, as well as operating
manuals, are just a codification of what past experience has taught us with regard to certain
specific tasks or events. This experience goes well beyond simple instructions such as "for an
operating temperature T, use bolts of such and such material." Most designers agree that there
is a feel for design, which can be gained with specific experience, inferred by intelligent
comparison with similar cases, or communicated by an expert in the field (a teacher, or a
senior engineer). Similarly, plant operators gain their knowledge by becoming familiar with
specific procedures (manuals), by specific experience in the field, or by being taught by
experts (senior operators or design engineers).
What is important here is that this body of knowledge is, or can be, codified in such a way
that it can conceivably be communicated to a computer. The
implications are clear: the computer can replace or supplement the human designer/operator
and, as the former is much faster, has a much larger memory, and performs more evenly and
reliably than the latter, the design or the monitoring process will gain from this
computerization. According to this point of view, virtually all engineering tasks could be
performed with the help of (or entirely by) a properly programmed computer. A list of the
possible specific fields of application of these AI techniques (listed here in increasing level of
complexity) is given below:
Component design
Process design and optimization
Qualitative process modeling: prognostics
A priori thermoeconomic optimization
By contrast, the list of the AI applications actually implemented for the above tasks is not as
extensive as one would expect:
Several applications in production scheduling (most of them not in real time)
Several applications in process control, to assist or replace the operator in
normal plant operation (notably, in emission monitoring)
Very few design procedures, the majority of which perform a sort of database
scanning activity to select the most appropriate component from a fixed, pre-assigned list
of possibilities
The causes of the discrepancy between what could be done and what is actually implemented
at the application level are diverse. There are, of course, technical reasons, like the lack of
good programmers and the intrinsic computational intensity of AI techniques (that require,
more than other computational techniques, large amounts of physical resources, and so on),
but they are not substantial, and cannot be invoked to justify such a wide gap between theory
and practice. A more fundamental reason is that AI is a relatively new field, and that its
theoretical foundations are still being discussed, so we are still in the "run-in" phase of the
technology. Another reason is a certain mistrust expressed by senior designers in a tool that,
consciously or unconsciously, some of them feel will take their place. Because of this general
attitude, it has sometimes been difficult to get experts of a
specific field to volunteer to transmit their knowledge to a computer code (for more on the
Knowledge Gathering Problem see Expert Systems and Knowledge Acquisition). There has
been passive opposition on the part of plant managers to substitute an expert monitoring
system for their reliable hybrid systems, comprising low-level control and monitoring
devices, governed by well-trained and experienced human operators. There have been
problems in convincing design teams to work together with knowledge engineers in
developing what they see as their electronic duplicate, and so on.
The technical shortcomings will be disposed of in the immediate future by the fast pace
of development in both hardware and software. The second problem, the immaturity of AI
techniques, will gradually be overcome by advances in the theoretical field that will hopefully
lead to the construction of a complete, formally well-defined, coherent, and unified applied
cognitive science. The mistrust on the part of
engineers, and the reluctance of plant managers, will probably disappear with time. Since the
adoption of a new technique depends on the degree of comprehension and the amount of
feeling that its (potential) users have developed for it, this is clearly an educational problem,
and will gradually be overcome when new generations of engineers can avail themselves of
new generations of codes that they know they can rely on.
6. Logical Systems
The language of mathematical logic, and in particular its subset of propositional and predicate
calculus, is an ordered and complete set of correlated symbols, which can be used to
represent the general knowledge and the specific or generic facts from which a solution to a
certain problem can be found. Propositional calculus consists of logical propositions joined
by logical connectives (AND, OR, NOT, IF...THEN, and EQUAL TO), and cannot handle either
quantification (p IS LARGER THAN q) or variable binding (FOR ALL X, p(X) IS TRUE).
Predicate calculus allows for both, and is therefore used in quantitative decision
making procedures, while propositional calculus is best applied in pure inference. Note that
when we refer to predicate calculus in this book, we shall always mean first order
predicate calculus; that is, the form in which the variables are allowed to be assigned
quantitative values, but the functions (equivalent to the predicates) are not.
A logical system is defined as a coherent system consisting of signs and
rules for the correct use of these signs. As such, it is not a theory proper, because it is not
limited to a set of sentences about a specific object or type: it is, in fact, equivalent to a
language. For our purposes, we can identify two main factors in the study of a language: its
syntax and its semantics. (The grammar of the language is of course of paramount importance
for its correct understanding, but we shall assume here that it can always be reduced to a
complete and coherent set of connectives and operators applied to the symbols.) The syntax of
a language establishes the rules for constructing and transforming it: it defines which signs are
to be used in a given context, how these signs can be connected to each other, and how they
are to be manipulated. Semantics is concerned with the interpretation of the signs the
language consists of, and decides about the values to be assigned to each sign, or about
interpretation rules for each sign or group of signs in a given context. For the language to be
(in some sense) meaningful, the individual signs or ordered sets of signs must be able to
represent our knowledge in some application domain. Furthermore, the sign evaluation rules
must guarantee that the transformation (translation) procedures be evaluation-invariant, that
is, that they preserve the value of sign concatenations.
(Actually, there is much more to be said about the concept of language. It can be remarked
here, though, that the study of logical systems (that is, the theory of language) leads to a better
definition of the possible and permissible extent of our knowledge domain. In his latest work,
Wittgenstein concluded that "the variable proposition must be of the same order of [meaning:
must possess the same attributes as; note of the present author] the concepts of Reality or
Universe.")
Once a language has proved complete and meaningful, it follows that by applying any
combination of transformation rules to a true proposition, another true proposition results.
Notice that, in this context, "true" means "logically true," and can, in fact, be physically
wrong. To avoid confusion, we will use "false" and "wrong" as the opposites of "true" and
"correct" respectively. The difference can be clearly represented by the following examples:
Example 1
Rule 1 (verbal expression): The efficiency of two components connected in series, such that
the only useful output of the first one is the only useful input of the second one, is equal to the
product of the two component efficiencies.
This rule corresponds to a metacode like:
IF (C2 in-series-with C1) THEN (ηtot = η1 · η2)
[η1 and η2 being the efficiencies of C1 and C2]. Nothing, however, prevents this logically
true rule from producing a physically wrong result (for instance, ηtot = 2.6). To avoid this,
additional rules ought to be inserted, that ensure for instance that 0 < η1 ≤ 1 and 0 < η2 ≤ 1.
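The guarded rule can be sketched as follows; the function name and the exception-based handling are illustrative choices, not the article's formalism:

```python
# Sketch of Rule 1 with the guard conditions the text calls for: the series
# efficiency rule only fires when both component efficiencies are physical.

def series_efficiency(eta1, eta2):
    """eta_tot = eta1 * eta2, guarded by 0 < eta <= 1 on both inputs."""
    for eta in (eta1, eta2):
        if not (0.0 < eta <= 1.0):
            raise ValueError(f"unphysical efficiency: {eta}")
    return eta1 * eta2

print(series_efficiency(0.9, 0.8))
# series_efficiency(2.0, 1.3) would raise instead of returning the
# logically true but physically wrong eta_tot = 2.6
```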
Example 2
Suppose the prescribed task is the generation of a schematic layout of a process to produce Pel
MW of electrical power. The conventional option (which we shall refer to as the A
configuration in the following) is to install an electrical generator whose output is Pel: this
generator will require a mechanical input Pshaft = Pel/ηgen. To produce Pshaft, the first four
options that come to mind are:
(B) a diesel engine, which requires a fuel input of Pshaft/ηB;
(C) a turbo gas set, which requires a fuel input of Pshaft/ηC;
(D) a steam plant, which requires a fuel input of Pshaft/ηD; or
(E) a direct tapping from the network of the required amount of electrical power.
So, the following energy conversion chains are all logically true:
(a) B+A
(b) C+A
(c) D+A
(d) E+A
Physically, though, option (E) is acceptable only for the plant auxiliaries, which can be
expressed by a rule like:
IF (Pauxiliaries > 0)
AND Pnetwork = (power tapped from the net)
AND Pauxiliaries ≤ ε·Pel
AND ε ≪ 1
THEN Pauxiliaries = Pnetwork is TRUE
where the factor ε (small but not equal to zero) takes into account the possibility of electrical
pump drives, or other similar auxiliary drives, but excludes the occurrence of alternative (d)
Both propositional and predicate calculus have been recognized to be complete and
meaningful, and enjoy a higher degree of simplicity and efficiency (power) than other
languages; they are therefore our preferred tools for AI applications.
(A related topic, which will only be mentioned here, is the enormous effort required to realize
a complete and reliable method of translating a natural language into a logical one. Though
the problem is far from being completely formulated, its solution would have a significant
impact on AI techniques, because natural language is the most natural form of expressing our
own knowledge. If such an automated translator could be built into a computer code, it
would then be possible to implement an electronic inference engine which would use not
intuition, but exact logical reasoning, to draw correct conclusions from the description of the
world we imparted to it via our natural language. This futuristic scenario is, in reality, much
further away than specialists thought it to be only a few years ago.)
Our aim is to represent our knowledge of a specific application domain in a logical
system; that is, to project our perception of the facts and rules belonging to the
context of that problem onto the symbol-space pertaining to that system. This constitutes the
basis upon which the logical system draws its subsequent inferences, to establish new facts
and new rules ("new" being used here to signify "not explicitly contained in the original
formulation of the problem").
The mechanisms by which inference can be gained are called, in AI terminology,
inferential (or inference) engines: they consist of a set of procedures defined in one of the
logical languages, which derive from a given batch of knowledge some information which
was not explicitly contained in that knowledge. They do so independently of the physical
value of the initially imparted knowledge; that is, they are indifferent to the "correctness"
or "wrongness" of the knowledge base.
There are basically two types of inferential engines: the backward chaining (BC) and the
forward chaining (FC) engine. A BC-engine tries to verify whether a fact or a rule (called a
goal) can be deduced from the knowledge base, and is said to be goal-driven. The search
proceeds backwards from the goal to a set of facts or rules that could be its cause, and are
either in the KB or deducible from it. The method works recursively, until either a reverse
causal path has been established, or the path is broken and the search has failed. To properly
apply the BC-engine, it is of paramount importance to be able to give it a precise, complete,
and detailed description of the goal and all of its attributes. An FC-engine consists of a set of
procedures that try to induce from the facts and the rules contained in the knowledge base
(which it regards as prime causes) some logically acceptable effects. This method, too,
works recursively, so that a chain of primary, secondary, tertiary, ... effects can be built. The
FC-engine is said to be data-driven. It has to be remarked at this point that the final effect
inferred by an FC-engine, being at a different abstraction level from the objects forming the
knowledge base, may appear to be strikingly different from them, and give the impression of an
"intuitive leap" in the inference process. Actually, this is not so, and the causal chain can
Some of the most recent applications combine the two approaches in an alternate direction
search (ADS). In this case, one branch of the search deduces effects from the knowledge base
(FC-engine), while the other branch tries to trace causes back from the goal (BC-engine). An
ADS solution is reached when the last effect from the FC-engine coincides with the first cause
of the BC-engine.
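A forward-chaining engine of the kind described above can be sketched in a few lines; the rule base and fact names below are invented for illustration:

```python
# Minimal forward-chaining (FC) engine sketch: rules are (premises,
# conclusion) pairs; the engine repeatedly fires every rule whose premises
# are all in the working memory, until no new fact can be added.

RULES = [
    ({"fuel_ok", "ignition_ok"}, "combustion"),
    ({"combustion", "turbine_ok"}, "shaft_power"),
    ({"shaft_power", "generator_ok"}, "electric_power"),
]

def forward_chain(facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)   # primary, secondary, ... effects
                changed = True
    return facts

derived = forward_chain({"fuel_ok", "ignition_ok", "turbine_ok", "generator_ok"})
print("electric_power" in derived)   # the tertiary effect was inferred
```

A BC-engine would run the same rule table in the opposite direction, recursively looking for premises that could establish a stated goal.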
Once completely formulated and implemented, both a BC- and an FC-engine result in
procedures that are not capable of handling a change in the universe of application (a typical
example of which is a problem whose solution depends on the technological level of the
environment, which of course is changing with time). This rigidity can be overcome to a
certain degree by a careful initial formulation of the original inference engine, but it is
generally impossible to foresee all of the changes that can affect an ES's domain of
application.
An approach that can be used to solve this problem is the blackboard approach introduced in
Section 3.13 above, which is presented here from a slightly different viewpoint.
Knowledge is subdivided into subdomains Si, on each one of which a separate inference
engine, IEi, operates. The entire problem is formulated as a succession of partial solutions,
which may be at different levels of abstraction. These partial solutions, Pi, constitute the
blackboard. There is a main procedure, M, which calls in a certain sequence those IEi that,
in its judgment, can produce a modification to the overall state of the blackboard. Each of
these IEs, operating on its own knowledge base, produces some changes
in the current blackboard state. In turn, other IEs examine the state just updated, and decide
whether they too can contribute to a further updating. The whole process is controlled by the
central inference engine, M. The process is repeated until either there is no modification in the
blackboard state and the goal has not been achieved (failure), or the global goal has been
achieved, regardless of whether the blackboard state can be further modified or not. It is
apparent that this description of a blackboard system bears a strict resemblance (a logical
analogy) to the logical scheme used by interdisciplinary design teams: they (try to) put
together different levels and types of knowledge by means of a hierarchically structured
procedure.
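The blackboard scheme just described can be sketched as follows; the agents, their knowledge subdomains, and the assumed 50% cycle efficiency are all invented for illustration:

```python
# Toy blackboard sketch: a main procedure M repeatedly offers the current
# blackboard state to several specialized "agents" (one IE per knowledge
# subdomain); each may post a partial solution Pi onto the blackboard.

def thermo_agent(bb):
    if "load_MW" in bb and "cycle" not in bb:
        bb["cycle"] = "steam"            # partial solution P1
        return True
    return False

def sizing_agent(bb):
    if "cycle" in bb and "boiler_MW" not in bb:
        bb["boiler_MW"] = bb["load_MW"] / 0.50   # assumed cycle efficiency
        return True
    return False

AGENTS = [thermo_agent, sizing_agent]

def run_blackboard(bb, goal):
    while goal not in bb:
        # M: let each IE in turn try to modify the blackboard state
        if not any(agent(bb) for agent in AGENTS):
            return None                  # no modification and no goal: failure
    return bb

print(run_blackboard({"load_MW": 100.0}, goal="boiler_MW"))
```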
In spite of the formal completeness and mathematical rigor of logical (symbolic)
systems, human thinking activities have been shown to exceed their representation
capabilities: that is, there are anomalous knowledge descriptions which, though perfectly
acceptable to a human expert, cannot be formalized (and thus, cannot be manipulated) by any
logical system. The most important examples of such descriptions are:
a. vague or approximate descriptions (for example, "if the evaporation pressure is
too high and there is no reheat, then you might have condensation somewhere in the
final stages of the steam turbine"); and
b. hidden assumptions (for example, "for a given steam throttle temperature,
optimize throttle pressure"; here, the hidden assumption is "the process is not
supercritical").
Vague descriptions can be tackled with a technique based on the so-called fuzzy sets theory,
which is a part of logical sets theory. The way to formalize approximate reasoning is
described briefly in Section 13 of this article, and more extensively in article Expert Systems
and Knowledge Acquisition. Hidden assumptions are more difficult to handle, because they
are invariably case-dependent. Simple, general cases can be treated using non-monotonic
logical systems (their treatment is beyond the scope of this essay). As far as engineering
applications are concerned, it is advisable to take a more pragmatic approach, and try to embed
in the knowledge base the highest possible degree of flexibility, so that, for instance, there
may be almost redundant descriptions of the same fact which can still be processed by the
inferential engine. It is possible that, in the not too distant future, rigorous and formally exact
omni-comprehensive logical treatments of anomalous thinking processes may be developed,
but in the meantime, simpler, more down-to-earth, and robust applications ought to be
implemented using our present day AI techniques.
7. Semantic Networks
A semantic network consists of a set of nodes and their connections. Each node represents a
single piece of information, and it can be a noun, an individual of a certain class, an activity, a
problem, an event, and so on. There are no explicit limitations on what a node can represent.
The internodal connections represent the relationships between the nodes, and they cannot be
arbitrary, but are chosen from a predefined set of logically and physically possible
relationships between the objects represented by the nodes. In this sense, the information that
a node can represent is implicitly limited: no node can be included in the network if there is
no relationship available that fits it. This is only a formal problem though, because it is always
possible to define an ad hoc relationship for any "special" or "anomalous" event.
In a semantic network, any triplet "node A - connection - node B" represents one of the possible
states of its universe. Given the extreme flexibility of the structure, it is possible to move
from one triplet to another, either in the direction of higher complexity or towards greater
simplicity. However, the most important characteristic of this type of knowledge
representation is its absolute neutrality with respect to any goal or direction: semantic
networks can (be used to) respond to very different (even conflicting) requests, because the
only "facts" they represent are formal relationships between states.
For this reason, semantic networks have enjoyed a remarkable degree of success in AI
applications: they have resulted in particular macrostructures, called object-oriented
languages, and in some very powerful paradigms, collectively constituting object-oriented
programming. See also Present Applications of Artificial Intelligence to Energy Systems.
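A minimal sketch of a semantic network as a store of such triplets, with the relation vocabulary fixed in advance as the text requires (the entities and relations below are invented):

```python
# A semantic network as a set of "node A - connection - node B" triplets.
RELATIONS = {"is-a", "part-of", "feeds"}

triplets = set()

def add(a, rel, b):
    if rel not in RELATIONS:
        raise ValueError(f"relation not in the predefined set: {rel}")
    triplets.add((a, rel, b))

add("boiler", "part-of", "steam_plant")
add("boiler", "feeds", "turbine")
add("steam_plant", "is-a", "thermal_system")

# Goal-neutral queries: the same structure answers very different requests.
def query(a=None, rel=None, b=None):
    return [t for t in triplets
            if (a is None or t[0] == a)
            and (rel is None or t[1] == rel)
            and (b is None or t[2] == b)]

print(query(a="boiler"))    # everything known about "boiler"
print(query(rel="is-a"))    # all taxonomy links
```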
8. Fuzzy Sets
The rule-based approach described above is equivalent to solving a problem by first
recognizing and then properly describing the relationships between causes (inputs, "facts")
and effects (outputs, "results"). Here, the underlying assumption is that reality can be fully
described by a binary logic: a fact F has either a "true" or a "false" relationship with a certain
effect E; that is, either F causes E or F does not cause E. Reality does not appear to us under a
strictly binary logic, though, especially in an engineering sense: the concept of uncertainty is
crucial to many a design procedure. Uncertainty may have two origins:
a. It may be due to an obscure, imprecise, or incomplete problem formulation. In this
case, sometimes it may be possible to resolve the uncertainty by a more accurate
knowledge gathering process, but this is often intrinsically impossible.
b. It may be because some of the facts contained in the knowledge base are expressions
of heuristic and/or subjective information.
Specific methods have been devised to handle uncertainty. Historically, of course, probability
methods were developed that deterministically computed the probability of an output if its
relationship with all of the inputs was known, and the probability of each of the inputs was
assigned. Qualitative reasoning (see Section 13) is another, entirely different approach, based
on an ad hoc extension of the logical connectives. By far the most popular method is that
called fuzzy logic, which is based on a very simple but extremely powerful assumption: that
an effect E "belongs" to the domain of the effects of a fact F with a certain degree of certainty,
usually ranging from 0 (F does not produce E) to 1 (F always produces E). The theory of
fuzzy sets (FST) was introduced in the mid 1960s, and it was originally expressed as a
"variable degree of belonging" theory. While in binary logic an object O either belongs or
does not belong to a set S (that is, its degree of belonging is either 0 or 1), in FST this degree
may take any real value between 0 and 1. At present, FST is used mostly in control,
because it leads to substantial performance improvements, both in the speed and in the
stability of a control system. It will be discussed at length in article Expert Systems and
Knowledge Acquisition.
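A membership function of the kind FST uses can be sketched as follows; the pressure breakpoints (60 and 100 bar) are arbitrary illustrative values, not taken from any plant:

```python
# Sketch of a fuzzy membership function: instead of "pressure is high"
# being strictly true or false, membership in the fuzzy set "high" varies
# continuously between 0 and 1.

def mu_high_pressure(p_bar):
    """Degree of membership of p_bar in the fuzzy set 'high pressure'."""
    if p_bar <= 60.0:
        return 0.0                   # definitely not high
    if p_bar >= 100.0:
        return 1.0                   # definitely high
    return (p_bar - 60.0) / 40.0     # linear ramp in between

print(mu_high_pressure(50.0))    # 0.0
print(mu_high_pressure(80.0))    # 0.5
print(mu_high_pressure(120.0))   # 1.0
```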
9. Neural Networks
The entire body of "classical AI," as discussed in the rest of this essay, is based on the
fundamental notion of cognitive symbolism: knowledge can always be subdivided into
"logical information bits," each one of which can be represented by a proper symbol. The
ensemble of the symbols constitutes a pseudolanguage, similar to our common languages but
devoid of some of their logically tedious shortcomings (double meanings, linguistic
paradoxes, homophonies, referential opacities, and so forth). The language manipulation is
ruled by a grammar (that defines what the symbols mean and under which forms they may be
correctly joined) and by a syntax (that decides on the admissible concatenations of
grammatically correct chains of symbols).
Recently (in the 1980s, but based on an idea developed in the 1940s by Pitts and McCulloch),
a different approach has been proposed, called cognitive connectionism. In this approach,
knowledge is considered to be distributed across a structured system, and actually to be
represented by the objects of the system, their relationships, and the structural form of
the system itself, so that it becomes impossible to associate a particular portion of knowledge
with any individual part of the system. This is the same as the approach known as "holism,"
which asserts that "the whole is at a higher logical level than the sum of its parts." This
approach has had a large impact on sciences like physics, chemistry, and biology, and it has
led to substantial advances in cybernetics and information science. Since its first development
took place within the frame of a biologist's research on the mode of operation of the human
brain, the AI methods that originate from it are collectively named artificial neural networks
(ANN). Though ANN may be used both to model the human brain and to solve cognitive
problems, we shall of course limit ourselves to the second issue. The basic idea is that of
constructing a (large!) network of single devices, each one of which operates according to our
present model of a neurone (Figure 5): it "fires" its response only if the sum of the inputs in a
prescribed time interval exceeds a threshold level.
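The threshold neurone of Figure 5 can be sketched in a few lines; the weights and threshold below are arbitrary values, chosen so that the unit behaves as a logical AND:

```python
# Sketch of the threshold neurone model: the unit "fires" (output 1) only
# if the weighted sum of its inputs exceeds a threshold level.

def neurone(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0

# A two-input unit behaving as a logical AND:
for pattern in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pattern, neurone(pattern, weights=(1.0, 1.0), threshold=1.5))
```

An ANN is a (large!) network of such units, with the weights adjusted during training rather than fixed by hand.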
10. Casual Versus Mechanical Learning: "Memory"
Once such a metarule is found, it is added to the KB, and constitutes a fact which, not being
present at all in the previous KB, has been "apprehended" by the code.
Though the entire field of "automatic learning" is being developed at a very fast pace, for our
purposes (that is, for design problems), it can be considered to be still in its infancy. In
practice, most "learning" or "self teaching" codes rely on neural network techniques. It is
conceivable that in the near future (two to five years) AI-based design codes embodying a
combination of mechanical and casual learning will become commercially available.
Applications of conceptual learning and conceptual clustering are still quite far down the
road.
"Memory" is strictly connected to "learning," though it is different enough from it to be
treated separately. By "memory," we mean here the ability of a code to remember concepts
and facts that were not originally in its knowledge base, but were somehow derived by the
inferential engine in its deductive or inductive procedure. Again, the topic is the subject of
rapid development, and it is also specialized enough to exceed the scope of this essay. It is
useful, though, to present here at least an introductory discussion of the general problem and
of some of the different approaches that have been tried. First of all, let us remark that storing
in memory and retrieving from memory are two quite different actions: while storing can be
thought of as a completely mechanistic action, very similar in practice to the building of a
database, retrieving resembles rather a strongly pruned search algorithm. Imagine that an
"expert process synthesizer" is at work on some design: suppose it finds a certain number m
of processes Pj (j = 1, 2, ..., m) that meet the specifications, and stores them in its "working
memory," ready for display to the user. Since each Pj will be stored as an object, it can be
retrieved either as a whole (by "name," so to say), or on the basis of any of its attributes. This
is an example of retrieval based on a direct search by keywords (the name of the object, say
P3, being considered as an attribute). This procedure is very simple, is strikingly similar to an
elementary biological "memory" activity, and can be successfully implemented in practice: it
is, though, limited by its inability to go beyond what is in effect a "labeling" task. If we want
to improve the performance of the retrieval action, we must introduce an inference procedure
that can first draw analogies or draw deductions from the existing data stored in the working
memory, and then choose the proper keywords accordingly. This is tantamount to introducing
an auxiliary expert system into the ES, which drives the search using its own inferential
engine. Let us consider as an example an important feature of an expert process assistant: the
capability to make proper use of waste fluxes and streams. Suppose our hypothetical "expert
process synthesizer" is assembling a process, and has already put together a few components
in such a way that the raw materials undergo some transformation towards the generation of
the products. Assume further that some of these components have two or more outputs, one of
which is the one we call the "main" output (the "product" of that component). This flux will
constitute the input of the next component to be added, while the remaining ones will be
somehow "available," but will not be used for the main production line (the "secondary
fluxes" or "by-products"). It would be desirable (because this is the way a human expert
would proceed in the design) to keep track of these streams and fluxes so that they can be
considered for use as "potential inputs" at some later stage. This can be achieved in principle
by constructing an auxiliary ES that performs the following operations:
a. For each component added to the process, the ES stores all of the by-products into the
working memory as objects belonging to a class "Utilities", each object being
identified by its proper attributes.
b. When considering which new component can be added to the existing partially
assembled process structure, the fluxes placed in memory as "Utilities" are scanned
first, and those which can be used to feed the new component are "recycled" internally.
c. After the process has been completely assembled (that is, a structure has been
constructed which transforms the raw materials into the specified products), the
remaining "Utilities" are inspected again, to see if any of them can be internally or
externally recycled to "improve" the process. The improvement can consist of more
units of output for unit of input (higher productivity), reduced use of external utilities
(higher efficiency, lower fuel consumption), or the contemporary production, in
addition to the specified output, of some other commercially or technically desirable
by-product.
d. If several process structures have been found, all of them satisfying the design
requirements, the auxiliary ES can even rank them in descending order of
"desirability," by considering more desirable those which have a higher productivity or
efficiency, or which generate the more convenient by-products.
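Steps (a) and (b) of the auxiliary ES can be sketched as follows; the component and flux names are invented, and the code illustrates only the "Utilities" bookkeeping, not a full process synthesizer:

```python
# Illustrative sketch of the auxiliary "Utilities" memory: by-products of
# each added component are stored as objects (step a) and scanned before a
# new external input is drawn (step b).

utilities = []   # working memory: objects of class "Utilities"

def add_component(name, main_output, by_products):
    # (a) store every by-product with its proper attributes
    for bp in by_products:
        utilities.append({"source": name, **bp})
    return main_output

def find_feed(required_kind):
    # (b) scan the stored utilities first; recycle internally when one fits
    for bp in utilities:
        if bp["kind"] == required_kind:
            utilities.remove(bp)
            return bp
    return None   # otherwise, fall back on an external utility

add_component("boiler", "steam_hp",
              by_products=[{"kind": "hot_water", "T_C": 90}])
feed = find_feed("hot_water")
print(feed["source"])          # recycled internally from the boiler
print(find_feed("hot_water"))  # None: the by-product is already used up
```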
This example shows that a desirable attribute of a "memory" is its ability to detect qualitative
analogies between the objects stored in the physical memory area and those we are searching
for. The procedures that perform this task are (quite complex) ESs themselves, and the logical
operation they perform is called matching. Let us denote as "data" the objects stored in
memory, and as "targets" those we are looking for. A "match" is said to exist between one
piece of data and a target if:
a. The list of the attributes of the two objects is the same. This is the most obvious
match, and is called identity.
b. All of the attributes of the target are contained in the list of attributes of the data. In
this case, a further scanning action is necessary to verify that the target is indeed an
instance of the class "data." The attributes of the target are said to be embedded in
those of the data.
c. There is no immediate correspondence between the two lists of attributes, but it is
possible to derive functional similarities, in the sense that there exist metalevels (at
least one) at which the two lists share some metaattributes.
In cases (a) and (b), the target is similar to (in case (a), identical with) the data, and retrieval is
immediate; in case (c), a further intelligent search is required at the detected metalevel. Notice
that the keyword search can be included in case (c) (the metalevel being identified by the
chosen keyword).
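The three match types can be sketched with Python sets of attribute names. All names and attribute lists below are illustrative assumptions, not part of any actual matching ES:

```python
# Classifying the match between a stored datum and a search target,
# following cases (a) identity, (b) embedding, (c) metalevel.

def match_type(data_attrs: set, target_attrs: set) -> str:
    """Classify the match between a datum and a target by their attribute lists."""
    if data_attrs == target_attrs:
        return "identity"        # case (a): the two attribute lists coincide
    if target_attrs <= data_attrs:
        return "embedding"       # case (b): target attributes contained in the data's
    return "metalevel"           # case (c): further intelligent search required

boiler = {"heat-exchanger", "pressurized", "water-tube", "fired"}
print(match_type(boiler, {"heat-exchanger", "pressurized", "water-tube", "fired"}))  # identity
print(match_type(boiler, {"heat-exchanger", "pressurized"}))                         # embedding
print(match_type(boiler, {"rotating", "bladed"}))                                    # metalevel
```

The subset test (`<=`) directly expresses the "embedded attributes" condition of case (b).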
In the field of AI at large, extensive research on "memory" applications is being performed in
the fields of image and sound recognition. While these developments are, at present, in no
way directly applicable to the field of process design, it would be advisable for interested
readers to consult the monographs on this topic listed in the bibliography.
11. Search Methods
As repeatedly stated in the previous sections, knowledge-based methods are in essence search
methods which try to scan a (possibly very large) solution space to find the (or at least a)
solution for the problem under consideration. Therefore, search techniques, and the problem
of the most appropriate search method, are a central issue in AI. A "search problem" is
characterized by a description of an initial and of a final state, not necessarily represented at
the same abstraction level. In this context, a "state" is the description of a well-identified
situation, or more precisely, the description of a structured objective situation. The search is
conducted by means of logical operators which, when applied to a certain state (or situation),
transform it into a different state. These operators can be logical inference rules,
representations of physical laws, or purely conventional ad hoc rules considered valid for that
particular search. Two sets of states are associated with each of these operators: the domain
space, defined as the set of all states to which the operator may be applied, and the co-domain
(or projection) space, which is the set of all states which may be generated by the operator. If
an operator O, applied to a state P, transforms it into a state R, then R is said to be P's first
successor, and correspondingly P is said to be R's first ancestor. There can be second, third, ...
ancestors and successors: given an initial state I, the set S composed of all of its successors is
called the search space, and obviously corresponds to the solution space: if a solution exists,
it is a member of this set.
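The state/operator vocabulary above can be sketched as follows, modeling an operator as a (domain test, transformation) pair: the domain space is the set of states passing the test, and the search space is accumulated by repeatedly generating successors. The toy integer states and operators are illustrative assumptions:

```python
# States and operators: an operator applied to a state yields a successor;
# the search space of an initial state is the set of all its successors.

def successors(state, operators):
    """First successors of `state` under all operators whose domain contains it."""
    return [apply_op(state) for applies, apply_op in operators if applies(state)]

def search_space(initial, operators, max_depth):
    """The set S of all successors of `initial`, explored down to `max_depth`."""
    space, frontier = set(), [initial]
    for _ in range(max_depth):
        frontier = [s for st in frontier for s in successors(st, operators)
                    if s not in space]
        space.update(frontier)
    return space

# Toy operators on integer states: (domain test, transformation) pairs.
ops = [
    (lambda s: s < 10, lambda s: s + 1),       # applicable only to states < 10
    (lambda s: s % 2 == 0, lambda s: s * 2),   # applicable only to even states
]
print(successors(4, ops))        # [5, 8]
print(search_space(1, ops, 3))   # all successors of 1 down to depth 3
```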
There are several search "strategies" or "techniques," including:
a. uniform irrevocable search strategy, or "hill climbing" (all of the so-called
"gradient techniques" belong to this class);
b. uniform "depth first" search, or "LIFO" ("last-in-first-out": the last state
identified will be the first one to be operated on);
c. uniform "breadth first" search, or "FIFO" ("first-in-first-out": the first state
identified will be the first one to be operated on); and
d. heuristic (possibly weighted) search for the first local optimum.
These algorithms are well known, and several practical implementations have been published:
interested readers are referred to the book by Press et al., which contains detailed descriptions
and macrocode schemes for many search procedures.
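As an illustration of the LIFO/FIFO distinction in the list above, the two uniform strategies can be written as one routine that differs only in which end of the frontier it expands next. The toy state space (integers with "add 1" and "double" operators) is an illustrative assumption:

```python
from collections import deque

def graph_search(initial, goal, successors, lifo=True):
    """Generic search: a LIFO frontier gives depth first, FIFO gives breadth first."""
    frontier, visited = deque([[initial]]), {initial}
    while frontier:
        path = frontier.pop() if lifo else frontier.popleft()  # LIFO vs FIFO
        state = path[-1]
        if state == goal:
            return path
        for s in successors(state):
            if s not in visited:
                visited.add(s)
                frontier.append(path + [s])
    return None  # goal not in the search space

# Toy state space: integers up to 15, operators "add 1" and "double".
succ = lambda n: [m for m in (n + 1, 2 * n) if m <= 15]
print(graph_search(1, 10, succ, lifo=False))  # breadth first: a shortest path to 10
print(graph_search(1, 10, succ, lifo=True))   # depth first: reaches 10 by a longer route
```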
12. Handling of Constraints
Every engineering design task is subject to a series of physical, technological, procedural,
environmental, or legal requirements that limit the choice of system configurations. Often,
these restrictions are so severe that they direct the designer towards a specific set of
configurations; and at times they are so strict that this set reduces to one configuration only.
When the design task is formulated as an engineering problem, these restrictions become
constraints, which must be accounted for by the design paradigm. Constraints can be
subdivided into four fundamental categories:
13. Qualitative and Approximate Reasoning: Belief
A particular metarule that makes coping with approximate knowledge representations easier is
the keyword rule (see also above, Section 3.9). It is again a hierarchical rule, which considers
only certain "key" features of the given facts to be important, and tries to match them with the
domains of the available inference rules. This is a very powerful approach, provided the
choice of the keywords is proper, which is of course a problem in itself (that ought to be
solved by the domain expert).
Besides syntactical and keyword analysis, there are procedures derived from mathematical
theories that allow us to compute the conditional probability of a correlated set of events:
these are the so-called Bayesian conditional probability method, the certainty approach, and
the Dempster-Shafer theory. The description of these applications, and of the underlying
theories, is outside the scope of this essay, and interested readers are referred to the excellent
treatment presented by Sriram. What we can say here, though, is that all of these methods
suffer from a fundamental weakness: they rely, in one way or another, on quantitative
measures of the degree of uncertainty which can be attached to a certain proposition. These
measures are usually expressed by real numbers between -1 and 1 (-1 representing absolute
disbelief, and 1 absolute certainty), and can be assigned only by domain experts:
nevertheless, it has been found that they tend to cover a foggy area in the interval [-1,1] rather
than assume a precise numerical value (for example, 0.4). Furthermore, the quantitative
measure of this uncertainty is strongly dependent on both the immediate and remote
experience of the domain expert, and the physical and logical environment of the case under
study. In the absence of better methods, we find that fuzzy set theory (Section 8 above) is still
an engineers best choice when dealing with approximate knowledge.
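For concreteness, one classical quantitative scheme of the kind discussed here is the MYCIN-style certainty-factor combination rule, taken from the general expert-systems literature rather than from this article; a minimal sketch of how two measures in [-1, 1] attached to the same proposition are merged:

```python
# MYCIN-style combination of two certainty factors in [-1, 1]
# (an instance of the "certainty approach" mentioned in the text).

def combine_cf(cf1, cf2):
    """Merge two certainty factors attached to the same proposition."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)          # two confirmations reinforce
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)          # two disconfirmations reinforce
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))  # mixed evidence

print(combine_cf(0.4, 0.4))    # ~0.64: two weak confirmations add up
print(combine_cf(0.8, -0.3))   # evidence for and against partially cancel
```

Note that the result never leaves [-1, 1], and that the scheme still depends on the expert-assigned point values whose vagueness the text criticizes.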
A different type of problem is posed by ambiguous knowledge. Ambiguity differs from
vagueness: probably the best definition is given by Quine: "ambiguous terms may be at once
clearly true of various objects and clearly false of various others." It can be shown that ambiguities are in
fact invariably linguistic (semantic) problems, and therefore they should be resolved in the
phase of problem formulation.
Another common type of knowledge, closely related to approximate reasoning, is that
expressed under the form of belief. It is approximate, because it does not result in "IF p THEN
q" statements, but it turns out it can be treated in a very effective way. Without going into
details, probably the best way of treating knowledge expressed under the form of a believed
fact (such as "I believe that the number of feedwater heaters in a conventional steam plant
never exceeds eight") is that of translating it into a form of conventional knowledge about the
believed fact and some unlikely or unusual contrary fact (such as "EITHER (number of
feedwater heaters) LESS-OR-EQUAL-TO 8 OR probability(plant is a conventional steam
plant) LESS-THAN 0.01"). It is important to remark here that not all such translations have to refer to
probabilistic utterances: what is essential is that the fact that would disprove the belief is
identified by a quantifiable degree of unlikeliness.
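The translation just described can be sketched as follows, using the feedwater-heater belief from the text; the data structure and function names are illustrative assumptions:

```python
# A believed fact stored as ordinary knowledge: the claim itself, plus the
# contrary fact that would disprove it and its quantified unlikeliness.

believed_facts = [
    # (claim predicate, description of the disproving fact, its unlikeliness)
    (lambda plant: plant["n_feedwater_heaters"] <= 8,
     "conventional steam plant with more than 8 feedwater heaters", 0.01),
]

def check(plant):
    """Accept the plant, or report which belief it contradicts and how
    unlikely the contrary fact was held to be."""
    for claim, contrary, unlikeliness in believed_facts:
        if not claim(plant):
            return f"contradicts belief: {contrary} (unlikeliness {unlikeliness})"
    return "consistent with all beliefs"

print(check({"n_feedwater_heaters": 7}))   # consistent with all beliefs
print(check({"n_feedwater_heaters": 11}))  # flags the violated belief
```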
The reason why we should seek ways to handle approximate knowledge is that it is much
more powerful than exact knowledge: it can allow for inference in cases in which an exact
treatment would fail. For example, when designing a Brayton-based gas turbine plant, we may
know in advance the maximum allowable cycle temperature, T3 = Tmax in Figure 6, but we
need to know the pressure ratio and the inlet temperature T1 to compute the required fuel-to-air
mass ratio α. If the facts we assign are "T1, p1, p2, α", the code would be unable to cope
with situations in which T3 is different from the specified Tmax. However, if an approximate
fact is substituted, like "0.01 ≤ α ≤ 0.05", the domain of application of the code is
substantially increased. Similarly, approximate facts are very useful in monitoring and control
applications.
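A minimal sketch of this point, denoting the fuel-to-air mass ratio by alpha; the exact value and the interval bounds follow the example above and are illustrative:

```python
# An exact fact fires a rule for one value only; an approximate (interval)
# fact keeps the rule applicable over a whole range of operating conditions.

def rule_applies_exact(alpha):
    return alpha == 0.032            # exact knowledge: brittle

def rule_applies_interval(alpha):
    return 0.01 <= alpha <= 0.05     # approximate knowledge: robust

for a in (0.032, 0.040):
    print(a, rule_applies_exact(a), rule_applies_interval(a))
```

The interval rule keeps working when operating conditions drift, which is exactly why approximate facts are valuable in monitoring and control applications.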
Related Chapters
Glossary
Abstraction
Backward chaining (BC)
Backtracking
Belief
Blackboard system
Boiler
Certainty
Class
Clause
Combustion chamber
Degree of membership
Deterministic programming
Direct design problem
Domain expert
Embedding
Emission monitoring
Encapsulation
Extended exergy accounting (EEA)
Extraction pump
Facts
Feasibility study
Feedback
Feedwater pump
representation
Pump
Thermoeconomics : Second-law based cost optimization techniques. The costs (in monetary
units) are computed with the aid of entropic and exergetic considerations.
Time-marching : A numerical integration technique, based on the discretization of the
time axis. Given the value of a certain function f at t = t0, its value at
t = t0 + Δt is given by f(t0 + Δt) = f(t0) + M(t0, f(t0), Δt), where M is the so-called time-marching operator.
Transfer function : A logical, symbolic, or mathematical operator Φ expressing the functional
relationship between the inputs I and the outputs O of a process or
component: O = Φ(I).
Weighted search : A search process in which each branch of the decision tree carries a
"weight" or "penalty" function: the objective of the search is to find the
path(s) for which the sum of the weights on the n branches constituting
the path(s) is minimal.
Well-posed : In a strict mathematical sense, a problem is said to be well-posed if it has
one unique solution, and this solution depends continuously on the
one unique solution, and this solution depends continuously on the
relevant data of the problem (boundary and initial conditions, values of
the coefficients, and so on).
Well-structured : A problem that can be described in terms of numerical variables,
possesses a univocally and well-defined objective function, and admits of
an algorithmic routine for its solution.
Bibliography
Brooks R. A. (1991). Intelligence without representation. Artificial Intelligence 47(1). [This is a rather
specialized essay on methods and techniques of non-algorithmic reasoning.]
Charniak E. and McDermott D. (1985). Introduction to Artificial Intelligence. Addison-Wesley. [An early, but
complete and well thought out, textbook on AI topics. Some of the definitions it gives are dated, but it may be
very useful to observe their developments in today's concepts.]
Deutsch D. (1997). The Fabric of Reality. Allen Lane. [This book was written for a large audience, and therefore
it shows some contamination between scientific precision and imaginative scientific fiction. Nevertheless, the
portion that deals with the representation of reality is very important for a correct understanding of the concept
of qualitative modeling.]
Freeman J. A. and Skapura D. M. (1992). Neural Networks. Addison-Wesley. [A precise, complete, and self-contained specialized textbook.]
Garland W. M. J. (1996). The role of knowledge-based systems in heat exchanger selection, design and
operation. New Developments in Heat Exchangers (ed. N. Afgan et al.). Gordon & Breach. [An interesting
example of a direct application of AI techniques to an apparently "trivial" engineering problem. The specific
results may be outdated at present, but the method is still valid.]
Giarratano J. and Riley G. (1989). Expert Systems: Principles and Programming. Boston, PWS Publ. Co. [A
very useful guide to the practical development and implementation of expert systems. It is outdated as far as the
programming language is concerned (at the time of writing, most AI programmers would use C++), but is very
interesting for its insistence on a conceptually and formally structured approach.]
Gillies D. (1996). Artificial Intelligence and Scientific Method. [A confutation of the accusation that AI methods
are "a-scientific" because of their insistence on the importance of the qualitative side of a scientific paradigm.]
Gong M. and Wall G. (1997). On Exergetics, Economics and Optimization of Technical Processes to meet
Environmental Conditions. (Proc. TAIES97, World Publishing Corp., Beijing 1997). [Though written without
explicit reference to AI methods, this is an interesting example of how engineering paradigms can be posed in a
formally qualitative fashion. An interesting example of a "new" engineering approach.]
Gregory R. (1994). Visual intelligence. What is Intelligence? (ed. J. Khalfa). Cambridge, UK: Cambridge
University Press. [By examining the logical structure of human vision, the author derives many fundamental
conclusions about the act of "intelligent understanding."]
Hopfield J. J. (1982). Neural networks and physical systems with emergent collective computational capabilities.
Proceedings of the National Academy of Sciences 79. [A specialized paper describing the idea that an increase in
the complexity of a system may bring about the appearance ("emersion") of discontinuous changes in the
properties of the system.]
Ignizio J. P. (1991). Introduction to Expert Systems. McGraw Hill. [A complete and well written textbook on
expert systems. Some of the definitions it gives are still in use today.]
Kowalski R. A. (1979). Logic for Problem Solving. North Holland. [An early, formal approach to the qualitative
solution of practical engineering problems. The author was one of the earliest advocates of object-oriented
programming.]
Liebowitz J. and De Salvo D. A., eds. (1989). Structuring Expert Systems. Yourdon Press. [A specialized essay
with a strong accent on the necessity of using an intrinsically structured knowledge gathering approach to
construct expert systems.]
Lloyd J. W., ed. (1990). Computational Logic. Springer Verlag Basic Res. Series. [A collection of articles on the
transposition of relational logic rules into computer instructions.]
McBride R. D. and O'Leary D. E. (1993). The use of mathematical programming with AI and ES. European
Journal of Oper. Res. 70(1). [An early precursor of MATHEMATICA, MAPLE, MATLAB, and the like. Strong
emphasis was given to the possibility of adopting known mathematical/logical constructs to perform qualitative
computations.]
McClelland J. and Rumelhart D. (1986). Parallel Distributed Processing, Vols. 1-2. MIT Press. [An attempt to
investigate the possibility of translating into cognitive science the mathematical properties of parallel
computing.]
McCulloch W. S. and Pitts W. (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of
Mathematical Biophysics, 5. [A pioneering work describing an attempt to formalize elementary nervous activity.
Vastly outdated today, but strikingly innovative for its times.]
Mechner D. A. (1998). All systems GO. The Sciences. New York Academy of Sciences, January/February 1998.
[A layman's exposition of some applications of AI techniques, together with extensive and useful comments and
comparisons.]
Negoita C. V. (1993). Expert Systems and Fuzzy Systems. Benjamin Cummings Pub. [A complete and well
thought out textbook on two AI topics that the author considers separately. More recently, fuzzy sets have been
developed into autonomous techniques that can be incorporated in expert systems as a part of their inferential
engine.]
Papalambros P. Y. and Wilde D. J. (1988). Principles of Optimal Design: Modeling and Computation.
Cambridge, UK: Cambridge University Press. [An attempt to formalize "optimal design."]
Parsaye K. and Chignell M. (1988). Expert Systems for Experts. John Wiley & Sons, Inc. [A rather specialized
book on expert systems. It contains a wealth of qualitative procedural information.]
Press W. H., Teukolsky S. A., Vetterling W. T., and Flannery B. P. (1988). Numerical Recipes in C. Cambridge University Press.
[A sort of "cookbook" of scientific computational routines. In this re-edition of an earlier FORTRAN version,
even heavier accent is placed on the necessity of implementing intrinsically structured programming techniques.
Though the authors never mention the words "object-oriented programming," this textbook may be seen as
marking the borderline between deterministic and O-O programming techniques.]
Quine W. V. O. (1960). Word and Object. MIT Press. [An exceptionally deep and complete analytical critique of
the concepts of cognitive science. The language (and its intelligence) is seen as the frame of reality.]
Rasmussen J. (1986). Information Processing and Human-Machine Interaction: An Approach to Cognitive
Engineering. North Holland Ser. in Syst. Sci. and Eng. [An early description of the "cognitive" approach to
solving engineering problems. Very limited emphasis on real design applications: this book is dedicated to the
investigation (fundamental at that time) of the guiding principles of knowledge gathering and processing.]
Rettaroli M. and Sciubba E. (1994). MASAI: a code for the symbolic calculation of thermal plants. ASME
PD/64/3. (Proc. ESDA Conf., London, 1994). [An attempt to substitute formal symbolic procedures for numerical
solvers. Still, only entirely deterministic procedures are proposed.]
Rich E. (1983). Artificial Intelligence. McGraw-Hill. [A basic, older textbook on the fundamentals of artificial
intelligence concepts and applications. Very few examples of applications to engineering design: in the 1980s
there was still a lack of first-level applications, and the accent was being put on conceptual topics.]
Sciubba E. (1998). Toward automatic process simulators: Part II. An expert system for process synthesis. J. Eng.
for GT and Power 120(1). [A formal description and detailed explanation of how a process synthesizer works.
The discussed applications are limited to fossil-fuelled power plants.]
Sciubba E. and Melli R. (1998). Artificial Intelligence in Thermal Systems Design: Concepts and Applications.
New York: NOVA SCIENCE Publishers. [An introductory but complete and well written textbook on AI topics.
A very strong bias towards applications results in a compression of the most theoretical topics, which are
referred to but seldom thoroughly discussed. A textbook for AI users.]
Scott A. (1996). The hierarchical emergence of consciousness. Math. & Comp. in Simul. 40. [A reflection on the
emergence of a more complex, hyperlogical structure from a deeper analysis of living and non-living systems.]
Schank R. and Birnbaum L. (1994). Augmenting Intelligence. What is Intelligence? (ed. J. Khalfa). Cambridge,
UK: Cambridge University Press. [Techniques to increase induction and deduction capabilities. The article
contains some strong criticism of current misconceptions about "intelligence" and its representation.]
Shannon C. E., McCarthy J., and Ross A. W. (1956). Automata studies. Annals of Mathematics Studies 34.
Princeton University Press. [Early reflections on the implications of Turing's studies on "intelligent machines."]
Shoham Y. and Moses Y. (1989). Belief as defeasible knowledge. Proc. IJCAI-89, 1989. [An interesting essay on
the possibility of incorporating an apparently vague and ambiguous statement of belief into a knowledge
manipulating engine that may draw proper inferences from it.]
Sriram R. D. (1997). Intelligent Systems for Engineering. Springer Verlag. [A complete, well-written, well-balanced textbook on AI. For the quantity and quality of information contained, it constitutes a necessary
reference for workers in this area.]
Widman L. E., Loparo K. A., and Nielsen N. R. (1989). Artificial Intelligence, Simulation and Modeling. J.
Wiley. [A predecessor of Srirams book. The emphasis here is more on applications, with special accent on the
relationship between an "object" and its "model."]
Wittgenstein L. (1996). Philosophie. Rome: Donzelli Pub. (in Italian and German). [A collection of the
philosopher's reflections on the relations between language, mind, and "reality." With extraordinary foresight,
some of the issues raised today by knowledge gathering and representation methods are clearly stated and
"discussed" in the peculiar style of Wittgensteins latest works.]
Biographical Sketch
Enrico Sciubba is a professor at the Department of Mechanical and Aeronautical Engineering of the University
of Rome 1 "La Sapienza" (UDR1), Rome, Italy. He received a master's degree in mechanical engineering from
UDR1 in 1972. From 1972 to 1973, he was a research assistant at the Chair of Turbomachinery in the same
university. From 1973 to 1975 he worked as research engineer in the research and development division of
BMW, Munich, Germany, where his tasks included the design, development, and testing of advanced i.c.
engines. After returning to UDR1 as a senior research assistant from 1975 to 1978, he enrolled in the Graduate
School of Mechanical Engineering, majoring in thermal sciences, at Rutgers University, New Brunswick, NJ,
USA, where he was granted his Ph.D. in 1981. From 1981 to 1985 he was assistant professor at the Catholic
University in Washington D.C., USA, teaching thermal sciences. He returned to the Department of Mechanical
and Aeronautical Engineering of UDR1 as a faculty member in 1986. He lectures on turbomachinery and energy
systems design, at both undergraduate and graduate levels. His research activities are equally divided among three
main fields: Turbomachinery design and CFD applications; Energy systems simulation and design; Applications
of AI-related techniques and procedures to the design, synthesis, and optimisation of complex energy systems.
His publications include more than thirty journal articles (mostly in international refereed journals in the field of
energy and applied thermodynamics), and over eighty refereed papers at international conferences. He published
one book on AI applications for NOVA Science, USA, and is writing a turbomachinery book for J. Wiley & Sons.
Dr. Sciubba is associate editor for three major international journals in the field of energy conversion, and is a
reviewer for several more.