
Progress in Energy and Combustion Science 29 (2003) 515–566

www.elsevier.com/locate/pecs

Artificial intelligence for the modeling and control of combustion processes: a review
Soteris A. Kalogirou*
Department of Mechanical Engineering, Higher Technical Institute, P.O. Box 20423, Nicosia 2152, Cyprus
Received 9 July 2002; revised 11 July 2003; accepted 11 July 2003

Abstract
Artificial intelligence (AI) systems are widely accepted as a technology offering an alternative way to tackle complex and ill-
defined problems. They can learn from examples, are fault tolerant in the sense that they are able to handle noisy and incomplete
data, are able to deal with non-linear problems, and once trained can perform prediction and generalization at high speed. They
have been used in diverse applications in control, robotics, pattern recognition, forecasting, medicine, power systems,
manufacturing, optimization, signal processing, and social/psychological sciences. They are particularly useful in system
modeling, such as in implementing complex mappings and system identification. AI systems comprise areas such as expert
systems, artificial neural networks, genetic algorithms, fuzzy logic and various hybrid systems, which combine two or more
techniques. The major objective of this paper is to illustrate how AI techniques might play an important role in modeling and
prediction of the performance and control of combustion processes. The paper outlines an understanding of how AI systems
operate by way of presenting a number of problems in the different disciplines of combustion engineering. The various
applications of AI are presented in a thematic rather than a chronological or any other order. Problems presented include two
main areas: combustion systems and internal combustion (IC) engines. Combustion systems include boilers, furnaces and
incinerators modeling and emissions prediction, whereas IC engines include diesel and spark ignition engines and gas engines
modeling and control. Results presented in this paper are testimony to the potential of AI as a design tool in many areas of
combustion engineering.
© 2003 Elsevier Ltd. All rights reserved.
Keywords: Artificial intelligence; Expert systems; Neural networks; Genetic algorithms; Fuzzy logic; Combustion; Internal combustion engines

Contents
1. Introduction
2. Expert systems
3. Artificial neural networks
3.1. Biological and artificial neurons
3.2. Artificial neural network principles
3.3. Network parameters selection
3.4. Artificial neural network architectures
3.4.1. Recurrent type architecture
3.4.2. Feedforward with multiple hidden slabs architecture
3.4.3. General regression neural network (GRNN) architecture
3.4.4. Group method data handling neural network (GMDH) architecture
4. Genetic algorithms
5. Fuzzy logic

* Tel.: +357-22-406466; fax: +357-22-494953.


E-mail address: skalogir@spidernet.com.cy (S.A. Kalogirou).

0360-1285/03/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.
doi:10.1016/S0360-1285(03)00058-3

5.1. Membership functions
5.2. Logical operations
5.3. If–then rules
5.4. Fuzzy inference system
6. Hybrid systems
7. Applications of artificial intelligence techniques in combustion processes
7.1. Applications of expert systems
7.2. Applications of neural networks
7.2.1. Combustion
7.2.2. Internal combustion engines
7.3. Applications of genetic algorithms
7.3.1. Combustion
7.3.2. Internal combustion engines
7.4. Applications of fuzzy logic
7.4.1. Combustion
7.4.2. Internal combustion engines
7.5. Applications of neuro-fuzzy logic
7.5.1. Combustion
7.5.2. Internal combustion engines
7.6. Other hybrid systems
7.6.1. Combination of genetic algorithms and fuzzy logic
7.6.2. Combination of artificial neural networks and genetic algorithms
8. Conclusions
References

1. Introduction

The possibility of developing a machine that would ‘think’ has intrigued human beings since ancient times. In 1637, the French philosopher-mathematician René Descartes predicted that it would never be possible to make a machine that thinks as humans do. However, in 1950, the British mathematician and computer pioneer Alan Turing declared that one day there would be a machine that could duplicate human intelligence in every way.

Artificial Intelligence (AI) is a term that in its broadest sense would indicate the ability of a machine or artifact to perform the same kinds of functions that characterize human thought. The term AI has also been applied to computer systems and programs capable of performing tasks more complex than straightforward programming, although still far from the realm of actual thought. According to Barr and Feigenbaum [1], AI is the part of computer science concerned with designing intelligent computer systems, i.e. systems that exhibit the characteristics we associate with intelligence in human behavior: understanding, language, learning, reasoning, solving problems and so on.

It should be noted that solving a computation does not indicate understanding, something a person who solved a problem would have. Human reasoning is not based solely on rules of logic. It involves perception, awareness, emotional preferences, values, evaluating experience, the ability to generalize and weigh options, and many more.

Machinery can outperform humans physically. Similarly, computers can outperform human mental functions in limited areas, notably in the speed of mathematical calculations. For example, the fastest computers developed are able to perform roughly 10 billion calculations per second. However, making computers that are more powerful will probably not be the way to create a machine capable of thinking. Computer programs operate according to set procedures, or logic steps, called algorithms. In addition, most computers do serial processing, in which operations of recognition and computation are performed one at a time. The human brain works in a manner called parallel processing, performing a number of operations simultaneously. To achieve simulated parallel processing, some supercomputers have been made with multiple processors to follow several algorithms at the same time.

AI consists of five major branches, i.e. expert systems, artificial neural networks (ANNs), genetic algorithms (GA), fuzzy logic and various hybrid systems, which are combinations of two or more of the branches mentioned previously.

Logic programs called expert systems allow computers to ‘make decisions’ by interpreting data and selecting from among alternatives. Expert systems take computers a step beyond straightforward programming, being based on a technique called rule-based inference, in which pre-established rule systems are used to process the data. Despite their sophistication, such systems still do not approach the complexity of true intelligent thought.
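As an illustration of rule-based inference, the following minimal sketch applies a small set of pre-established IF-THEN rules to a set of known facts until no new conclusion can be drawn (forward chaining). It is not taken from the paper: the rule contents, the fact names such as `high_flue_gas_o2`, and the function `forward_chain` are hypothetical and chosen only to suggest a combustion setting.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule is (premises, conclusion): IF all premises hold THEN add the conclusion.
Rule = Tuple[FrozenSet[str], str]

# Hypothetical rules for illustration only; they are not taken from the paper.
RULES: List[Rule] = [
    (frozenset({"high_flue_gas_o2"}), "excess_air_too_high"),
    (frozenset({"excess_air_too_high", "low_efficiency"}), "reduce_air_supply"),
]


def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Apply the rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


if __name__ == "__main__":
    derived = forward_chain({"high_flue_gas_o2", "low_efficiency"}, RULES)
    print(sorted(derived))
```

In this sketch the rules are data rather than program logic, so the rule set can be extended without touching the inference loop, which is one plausible reading of how such systems go beyond straightforward programming.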

Nomenclature

b1  constant term (bias)
E  RMS error
I  number of input parameters
m  momentum factor
o  desired output vectors over all pattern (p)
O  number of output parameters
p  patterns
p  constant parameter
Pi  number of training patterns
q  constant parameter
r  constant parameter
S  binary operator representing the addition of membership functions μA(x) and μB(x)
t  network output (target)
T  binary operator representing the multiplication of membership functions μA(x) and μB(x)
w  weight values

Greek variables
α  activation for each node
βi  weighted average obtained by combining all input numerical information from upstream nodes
δ  error term
ε  learning rate which determines the size of the weight adjustments during each training iteration
μ  membership function

ANNs are collections of small individually interconnected processing units. Information is passed between these units along interconnections. An incoming connection has two values associated with it, an input value and a weight. The output of the unit is a function of the summed value. ANNs, while implemented on computers, are not programmed to perform specific tasks. Instead, they are trained with respect to data sets until they learn patterns used as inputs. Once they are trained, new patterns may be presented to them for prediction or classification. ANNs can automatically learn to recognize patterns in data from real systems or from physical models, computer programs, or other sources. An ANN can handle many inputs and produce answers that are in a form suitable for designers.

GAs are inspired by the way living organisms adapt to the harsh realities of life in a hostile world, i.e. by evolution and inheritance. In the process, the algorithm imitates the evolution of a population by selecting only fit individuals for reproduction. Therefore, a GA is an optimum search technique based on the concepts of natural selection and survival of the fittest. It works with a fixed-size population of possible solutions of a problem, called individuals, which are evolving in time. A GA utilizes three principal genetic operators: selection, crossover, and mutation.

Fuzzy logic is used mainly in control engineering. It is based on fuzzy logic reasoning which employs linguistic rules in the form of IF–THEN statements. Fuzzy logic and fuzzy control feature a relative simplification of a control methodology description. This allows the application of a ‘human language’ to describe the problems and their fuzzy solutions. In many control applications, the model of the system is unknown or the input parameters are highly variable and unstable. In such cases, fuzzy controllers can be applied. These are more robust and cheaper than conventional PID controllers. It is also easier to understand and modify fuzzy controller rules, which not only use the human operator’s strategy but are also expressed in natural linguistic terms.

Hybrid systems combine more than one of the technologies introduced above, either as part of an integrated method of problem solution, or to perform a particular task that is followed by a second technique, which performs some other task. For example, neuro-fuzzy controllers use neural networks and fuzzy logic for the same task, i.e. to control a process, whereas in another hybrid system a neural network may be used to derive some parameters and a GA might be used subsequently to find an optimum solution to a problem.

For the modeling, prediction of performance and control of combustion processes, analytic computer codes are often used. The algorithms employed are usually complicated, involving the solution of complex differential equations. These programs usually require large computer power and need a considerable amount of time to give accurate predictions. Instead of complex rules and mathematical routines, AI systems are able to learn the key information patterns within a multi-dimensional information domain. In addition, many of the AI systems, such as neural networks, are fault tolerant, robust, and noise immune [2]. Data from combustion processes, being inherently noisy, are good candidate problems to be handled with AI systems.

When dealing with research and design associated with combustion processes, there are often difficulties encountered in handling situations where there are many variables involved. To adequately model and predict the behavior of combustion requires consideration of non-linear multi-variate inter-relationships, often in a ‘noisy’ environment. For example, in the prediction of performance, modeling or control of a combustion process from the point of view of energy efficiency, prediction of boiler emissions, or control of an internal combustion (IC) engine, there are numerous variables involved. The precise interactions of the variables are not fully understood or cannot easily be modeled.

Analytical techniques have been very successful in the study of the behavior of engineering systems such as heat

transfer, thermal processes, and other areas. While the analytical models have been valuable in understanding principles and useful where less than optimal designs were acceptable, with the advent of digital computers numerical methods became much more attractive than analytical solutions, as they could handle more complex and realistic situations. Numerical methods have their limitations as well. They cannot easily account for practical limitations, and they tend to perform well at analyzing a situation but not so well as a designer’s tool for quickly looking at options. Additionally, the number of variables that can be considered is still limited and numerical solutions cannot usually be obtained directly. Frequently, complex systems for which there is no exact model of behavior need to be designed. Furthermore, designers have to design or deal with complex systems where expected performances are completely unknown. Much of the complexity is due to the multi-parameter and multi-criteria aspects of a system’s design which are not easily handled using rules of thumb, analytical methods, physical models or numerical methods.

Many of the combustion problems are exactly the types of problems and issues for which AI approaches appear to be most applicable. In these models of computation, attempts are made to simulate the powerful cognitive and sensory functions of the human brain and to use this capability to represent and manipulate knowledge in the form of patterns. Based on these patterns, neural networks, for example, model input–output functional relationships and can make predictions about other combinations of unseen inputs. Many of the AI techniques have the potential for making better, quicker and more practical predictions than any of the traditional methods.

AI analysis is based on past history data of a system and is therefore likely to be better understood and appreciated by designers than other theoretical and empirical methods. AI may be used to provide innovative ways of solving design issues and will allow designers to get an almost instantaneous expert opinion on the effect of a proposed change in a design.

The objective of this paper is to introduce briefly the various AI techniques and to present various applications in combustion modeling and control problems. The scope is to demonstrate the possibilities of applying AI to combustion processes. This will be achieved by way of presenting applications of AI in various combustion related problems. The problems are presented in a thematic rather than a chronological or any other order. Problems presented include two main areas: combustion systems and IC engines. Combustion systems include boilers, furnaces and incinerators modeling and emissions prediction, whereas IC engines include diesel and spark ignition engines and gas engines modeling and control. This will show the capability of AI techniques as tools in combustion processes prediction, modeling and control.

2. Expert systems

An expert system is a computer program in which the knowledge of an expert on a specific subject can be incorporated in order to solve problems or give advice [3]. Thus, an expert system is a program capable of emulating human cognitive skills such as problem solving, visual perception and language understanding. An expert system must be distinguished from conventional application programs, as it exhibits certain characteristics that the latter do not have. Specifically, an expert system [4]:

1. Can be designed for solving complex problems ordinarily requiring human experts.
2. Embodies both expert knowledge and logically inferring means; the former should be stored in a symbolic declarative language; the latter would consist of heuristic search and reasoning procedures for utilizing the stored information.
3. Is capable of achieving high performance in narrowly specified domains of incremental development, dealing with incomplete or uncertain data, handling unforeseen situations and explaining or justifying its results.
4. Is limited to a specific area of human expertise.
5. Can be designed to grow on an evolutionary basis, improving its ‘expertise’ as it grows.
6. Can represent the expertise using facts and rules.
7. Is able to use other knowledge representation methods to handle knowledge which is not well expressed as rules.

What is expected from an expert system is to deal with problems of scientific or commercial nature and to provide correct solutions in a reasonable time. Moreover, similarly to the human expert, it should be able to provide explanations about the conclusions reached, by displaying in some way all the steps of its reasoning process. Additionally, it should be able to provide explanations about the course of action it will follow according to the user’s answers to the questions that the system is programmed to ask [3,4]. More details on expert systems can be found in Refs. [3–8].

An expert system usually consists of a knowledge base, an inference mechanism, an explanation component, a user interface and an acquisition component [5]. The knowledge base usually contains two different databases, a static and a dynamic. The static database contains the knowledge about the domain, represented in a certain formalism. It is created once, when the system is being developed by the user, but it can be modified at runtime (with addition of new facts, deletion of some part of the existing knowledge or alteration of some part of it). The dynamic database may be enriched during each execution of the program but the information is lost when the execution is terminated. It is used to store all information obtained from the user, as well as intermediate conclusions (facts) that are inferred during the reasoning

process. The knowledge base contains all the facts, rules and procedures which are important for problem solving in a specific area of application.

Object-oriented programming is usually used in the development of expert systems, in which certain procedures and functions may have to be assigned to specific objects. In addition to objects, the knowledge base provides rules, presented in the form: IF premise THEN conclusion and/or action. In the premise part, questions are asked about the logical links between the characteristics of the objects, whereas in the conclusions part, new facts and characteristics are added to the knowledge base and/or actions are executed. This is often referred to as rule-based programming. When creating a knowledge base the following should be considered by the knowledge engineer who works with the human expert/s:

1. Objects to be defined.
2. The relationship between the objects.
3. The way that the rules will be formulated and processed, and
4. The completeness of the knowledge base with respect to solving the specific problem.

The inference mechanism contains the control methods that indicate how the present knowledge is to be processed, in order to obtain solutions and conclusions to the problem. The inference mechanism represents the logical unit by means of which conclusions are drawn from the knowledge base according to a defined problem-solving method, which simulates the problem solving process of a human expert. It reflects the way in which the system reasons on the acquired knowledge, and it is interrelated with the human way of reasoning. The functions of the inference mechanism are [5]:

1. To determine which actions are to be executed between the individual parts of the expert system, how they are to be executed and in which sequence.
2. To determine how and when the rules will be processed and, if applicable, to select which rules will be processed.
3. To control the dialog with the users.

The rule-processing mechanism chosen is of primary importance in determining the performance of the entire system. Different types of problems require different types of inference mechanisms. Additionally, the inference mechanism must be adapted to the problem to be solved.

The explanation component explains the problem solving strategy to the user. The solutions determined by the expert system must be reproducible, both by the knowledge engineer (during the test phase) and by the user. It is advantageous to know at any point during the operation of the system how far the system has progressed in the processing of the problem.

The user interface determines how the expert system interacts with the user. It provides the user with explanations on the system’s performance, obtains from the user the information that the system needs in order to perform, and presents the results obtained during the reasoning process. As this component interacts directly with the user, the success of the expert system greatly depends on this component. Therefore, the user interface is required to be user friendly, to be able to prevent erroneous inputs as much as possible, to supply the results in an appropriate form to the user, and to provide understandable questions and explanations.

Finally, the acquisition component provides support for the structuring and implementation of the knowledge in the knowledge base. Therefore, the work of the knowledge engineer gets considerable support from a good acquisition component.

3. Artificial neural networks

The concept of ANN analysis was discovered nearly 50 years ago, but it is only in the last 20 years that applications software has been developed to handle practical problems. The history and theory of neural networks have been described in a large number of publications and will not be covered in this paper except for a very brief overview of how neural networks operate. The basic features of some of the most widely used neural network architectures are also described.

ANNs are good for some tasks while lacking in some others. Specifically, they are good for tasks involving incomplete data sets, fuzzy or incomplete information, and for highly complex and ill-defined problems, where humans usually decide on an intuitional basis. They can learn from examples, and are able to deal with non-linear problems. Furthermore, they exhibit robustness and fault tolerance. The tasks that ANNs cannot handle effectively are those requiring high accuracy and precision, as in logic and arithmetic. ANNs have been applied successfully in a number of application areas. Some of the most important ones are:

1. Function approximation. Mapping of a multiple input to a single output is established. Unlike most statistical techniques, this can be done with adaptive model-free estimation of parameters.
2. Pattern association and pattern recognition. This is a problem of pattern classification. ANNs can be effectively used to solve difficult problems in this field, for instance in sound, image, or video recognition. This task can even be made without an a priori definition of the pattern. In such cases, the network learns to identify totally new patterns.
3. Associative memories. This is the problem of recalling a pattern when given only a subset clue. In such

applications, the network structures used are usually complicated, composed of many interacting dynamical neurons.
4. Generation of new meaningful patterns. This general field of application is relatively new. Some claims are made that suitable neuronal structures can exhibit rudimentary elements of creativity.

ANNs have been applied successfully in various fields of mathematics, engineering, medicine, economics, meteorology, psychology, neurology, and many others. Some of the most important applications are in pattern, sound and speech recognition, in the analysis of electromyographs and other medical signatures, in the identification of military targets and in the identification of explosives in passenger suitcases. They have also been used in weather and market trends forecasting, in the prediction of mineral exploration sites, in electrical and thermal load prediction, in adaptive and robotic control and many others. Neural networks are also used for process control because they can build predictive models of the process from multi-dimensional data routinely collected from sensors.

Neural networks obviate the need to use complex mathematically explicit formulas, computer models, and impractical and costly physical models. Some of the characteristics that support the success of ANNs and distinguish them from the conventional computational techniques are [9]:

• The direct manner in which ANNs acquire information and knowledge about a given problem domain (learning interesting and possibly non-linear relationships) through the ‘training’ phase.
• Neural networks can work with numerical or analogue data that would be difficult to deal with by other means because of the form of the data or because there are so many variables.
• Neural network analysis can be conceived of as a ‘black box’ approach and the user does not require sophisticated mathematical knowledge.
• The compact form in which the acquired information and knowledge is stored within the trained network and the ease with which it can be accessed and used.
• Neural network solutions can be robust even in the presence of ‘noise’ in the input data.
• The high degree of accuracy reported when ANNs are used to generalize over a set of previously unseen data (not used in the ‘training’ process) from the problem domain.

While neural networks can be used to solve complex problems they do suffer from a number of shortcomings. The most important of them are:

• The data used to train neural nets should contain information which, ideally, is spread evenly throughout the entire range of the system.
• There is limited theory to assist in the design of neural networks.
• There is no guarantee of finding an acceptable solution to a problem.
• There are limited opportunities to rationalize the solutions provided.

The following sections briefly explain how the artificial neuron is derived from the biological one and the steps required to set up a neural network. Additionally, the characteristics of some of the most widely used neural network architectures are described.

3.1. Biological and artificial neurons

A biological neuron is shown in Fig. 1. In the brain, there is a flow of coded information (using electrochemical media, the so-called neurotransmitters) from the synapses towards the axon. The axon of each neuron transmits information to a number of other neurons. The neuron receives information at the synapses from a large number of other neurons. It is estimated that each neuron may receive stimuli from as many as 10,000 other neurons. Groups of neurons are organized into sub-systems and the integration of these sub-systems forms the brain. It is estimated that the human brain has around 100 billion interconnected neurons.

Fig. 2 shows a highly simplified model of an artificial neuron, which may be used to simulate some important aspects of the real biological neuron. An ANN is a group of interconnected artificial neurons, interacting with one another in a concerted manner. In such a system, excitation is applied to the input of the network. Following some suitable operation, it results in a desired output. At the synapses, there is an accumulation of some potential, which in the case of the artificial neurons is modeled as a connection weight. These weights are continuously modified, based on suitable learning rules.

3.2. Artificial neural network principles

According to Haykin [10], a neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the human brain in two respects: the knowledge is acquired by the network through a learning process, and inter-neuron connection strengths known as synaptic weights are used to store the knowledge.

ANN models may be used as an alternative method in engineering analysis and predictions. ANNs mimic somewhat the learning process of a human brain. They operate like a ‘black box’ model, requiring no detailed information about the system. Instead, they learn the relationship between the input parameters and the controlled and uncontrolled variables by studying previously recorded data, similar to the way a non-linear regression might perform. Another advantage of using ANNs is their ability
S.A. Kalogirou / Progress in Energy and Combustion Science 29 (2003) 515–566 521

Fig. 1. A simplified model of a biological neuron.

Fig. 2. A simplified model of an artificial neuron.

to handle large and complex systems with many interrelated parameters. They seem to simply ignore excess input parameters that are of minimal significance and concentrate instead on the more important inputs.

A schematic diagram of a typical multi-layer feedforward neural network architecture is shown in Fig. 3. The network usually consists of an input layer, some hidden layers and an output layer. In its simple form, each single neuron is connected to other neurons of a previous layer through adaptable synaptic weights. Knowledge is usually stored as a set of connection weights (presumably corresponding to synapse efficacy in biological neural systems). Training is the process of modifying the connection weights in some orderly fashion using a suitable learning method. The network uses a learning mode, in which an input is presented to the network along with the desired output and the weights are adjusted so that the network attempts to produce the desired output. The weights after training contain meaningful information whereas before training they are random and have no meaning.

Fig. 4 shows how information is processed through a single node. The node receives weighted activation of other nodes through its incoming connections. First, these are added up (summation). The result is then passed through an activation function; the outcome is the activation of the node. For each of the outgoing connections, this activation

Fig. 3. Schematic diagram of a multi-layer feedforward neural network.



Fig. 4. Information processing in a neural network unit.
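The single-node processing described around Fig. 4 (weighted summation followed by an activation function) can be illustrated with a minimal sketch. The function names, the optional bias argument and the choice of the logistic function as default activation are illustrative assumptions, not part of the original text.

```python
import math


def logistic(x):
    """Logistic sigmoid, used here only as an example activation."""
    return 1.0 / (1.0 + math.exp(-x))


def node_activation(inputs, weights, bias=0.0, activation=logistic):
    """Sum the weighted incoming activations, then apply the activation function."""
    total = sum(w * a for w, a in zip(weights, inputs)) + bias
    return activation(total)


def layer_forward(inputs, weight_rows, biases, activation=logistic):
    """Compute the activation of every node in one layer.

    weight_rows[i] holds the incoming weights of node i (hypothetical layout).
    """
    return [
        node_activation(inputs, weights, bias, activation)
        for weights, bias in zip(weight_rows, biases)
    ]
```

Calling `layer_forward` once per layer, feeding each result into the next call, gives the forward pass of a feedforward network of the kind shown in Fig. 3.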

value is multiplied with the specific weight and transferred to the next node.

A training set is a group of matched input and output patterns used for training the network, usually by suitable adaptation of the synaptic weights. The outputs are the dependent variables that the network produces for the corresponding input. It is important that all the information the network needs to learn is supplied to the network as a data set. When each pattern is read, the network uses the input data to produce an output, which is then compared to the training pattern, i.e. the correct or desired output. If there is a difference, the connection weights (usually but not always) are altered in such a direction that the error is decreased. After the network has run through all the input patterns, if the error is still greater than the maximum desired tolerance, the ANN runs again through all the input patterns repeatedly until all the errors are within the required tolerance. When the training reaches a satisfactory level, the network holds the weights constant and the trained network can be used to make decisions, identify patterns, or define associations in new input data sets not used to train it.

By learning, it is meant that the system adapts (usually by changing suitable controllable parameters) in a specified manner so that some parts of the system suggest a meaningful behavior, projected as output. The controllable parameters have different names such as synaptic weights, synaptic efficacies, free parameters and others.

The classical view of learning is well interpreted and documented in approximation theories. In these, learning may be interpreted as finding a suitable hypersurface that fits known input/output data points in such a manner that the mapping is acceptably accurate. Such a mapping is usually accomplished by employing simple non-linear functions that are used to compose the required function [11].

A more general approach to learning is adopted by Haykin [10], in which learning is a process by which the free parameters of a neural network are adapted through a continuing process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place.

An even more general approach is suggested by Neocleous [12], in which learning is achieved through any change, in any characteristic of a network, so that meaningful results are achieved. Thus, learning could be achieved, among others, through synaptic weight modification, network structure modifications, through appropriate choice of activation functions and others. A procedure for choosing the appropriate network parameters to facilitate learning is presented in Section 3.3.

By meaningful results, it is meant that a desired objective is met with a satisfactory degree of success. The objective is usually quantified by a suitable criterion or cost function. It is usually a process of minimizing an error function or maximizing a benefit function. In this respect, learning resembles optimization. That is why a GA, which is an optimum search technique (Section 4), can also be employed to train ANNs.

Several algorithms are commonly used to achieve the minimum error in the shortest time. There are also many alternative forms of neural networking systems and, indeed, many different ways in which they may be applied to a given problem. The suitability of an appropriate paradigm and strategy for application is very much dependent on the type of problem to be solved.

The most popular learning algorithms are the backpropagation (BP) and its variants [1,13]. The BP algorithm is one of the most powerful learning algorithms in neural networks. The training of all patterns of a training data set is called an epoch. The training set has to be a representative collection of input–output examples. BP training is a gradient descent algorithm. It tries to improve the performance of the neural network by reducing the total error by changing the weights along its gradient. The error is expressed by the root-mean-square value (RMS), which can be calculated by:

$$E = \frac{1}{2}\left[\sum_{p}\sum_{i}\left|t_{ip} - o_{ip}\right|^{2}\right]^{1/2} \qquad (1)$$

where E is the RMS error, t the network output (target), and o the desired output vectors over all patterns p. An error of zero would indicate that all the output patterns computed by the ANN perfectly match the expected values and the network is well trained. In brief, BP training is performed by initially assigning random values to the weight terms $(w_{ij})$¹

¹ The j subscript refers to a summation of all nodes in the previous layer of nodes and the i subscript refers to the node position in the present layer.

in all nodes. Each time a training pattern is presented to the ANN, the activation for each node, $a_{pi}$, is computed. After the output of the layer is computed the error term, $\delta_{pi}$, for each node is computed backwards through the network. This error term is the product of the error function, E, and the derivative of the activation function and hence is a measure of the change in the network output produced by an incremental change in the node weight values. For the output layer nodes and for the case of the logistic-sigmoid activation, the error term is computed as:

$$\delta_{pi} = (t_{pi} - a_{pi})\,a_{pi}\,(1 - a_{pi}) \qquad (2)$$

For a node in a hidden layer:

$$\delta_{pi} = a_{pi}\,(1 - a_{pi})\sum_{k}\delta_{pk}\,w_{kj} \qquad (3)$$

In the latter expression, the k subscript indicates a summation over all nodes in the downstream layer (the layer in the direction of the output layer). The j subscript indicates the weight position in each node. Finally, the $\delta$ and $a$ terms for each node are used to compute an incremental change to each weight term via:

$$\Delta w_{ij} = \varepsilon\,(\delta_{pi}\,a_{pj}) + m\,w_{ij}(\mathrm{old}) \qquad (4)$$

The term $\varepsilon$ is referred to as the learning rate and determines the size of the weight adjustments during each training iteration. The term m is called the momentum factor. It is applied to the weight change used in the previous training iteration, $w_{ij}(\mathrm{old})$. Both of these constant terms are specified at the start of the training cycle and determine the speed and stability of the network.

3.3. Network parameters selection

While most scholars are concerned with the techniques to define ANN architecture, practitioners want to apply the ANN architecture to the model and obtain quick results. The neural network architecture refers to the arrangement of neurons into layers and the connection patterns between layers, activation functions and learning methods. The neural network model and the architecture of a neural network determine how a network transforms its input into an output. This transformation is in fact a computation. Often the success depends upon a clear understanding of the problem regardless of the network architecture. However, in determining which neural network architecture provides the best prediction it is necessary to build a good model. It is essential to be able to identify the most important variables in a process and generate best-fit models. How to identify and define the best model is very controversial.

Although there are differences between traditional approaches and neural networks, both methods require preparing the model. The classical approach is based on the precise definition of the problem domain as well as the identification of a mathematical function or functions to describe it. It is however very difficult to identify an accurate mathematical function when the system is non-linear and there are parameters that vary with time due to several factors. The control program often lacks the capability to adapt to the parameter changes. Neural networks are used to learn the behavior of the system and subsequently used to simulate and predict its behavior. In defining the neural network model, first the process and the process control constraints have to be understood and identified. Then the model is defined and validated.

When using a neural network for prediction, the following steps are crucial. First, a neural network needs to be built to model the behavior of the process and the values of the output are predicted based on the model. Second, based on the neural network model obtained in the first phase, the output of the model is simulated using different scenarios. Third, the control variables are modified so as to control and optimize the output.

When building the neural network model the process has to be identified with respect to the input and output variables that characterize the process. The inputs include measurements of the physical dimensions, measurements of the variables specific to the environment or equipment, and controlled variables modified by the operator. Variables that do not have any effect on the variation of the measured output are discarded. These are estimated by the contribution factors of the various input parameters. These factors indicate the contribution of each input parameter to the learning of the neural network and are usually estimated by the network, depending on the software employed.

The selection of training data has a vital role in the performance and convergence of the neural network model. An analysis of historical data to identify the variables that are important to the process is essential. Plotting graphs to check whether the charts of the various variables reflect what is known about the process from operating experience and for discovery of errors in data is very helpful.

The input and output values are normalized. All input and output values are usually scaled individually such that the overall variance in the data set is maximized. This is necessary as it leads to faster learning. The scaling used is either in the range −1 to 1 or in the range 0 to 1 depending on the type of data and the activation function used.

The basic operation that has to be followed to successfully handle a problem with ANNs is to select the appropriate architecture and the suitable learning rate, momentum, number of neurons in each hidden layer and the activation function. The procedure for finding the best architecture and the other network parameters is shown graphically in Fig. 5. This is a laborious and time-consuming method but as experience is gathered, some parameters can be predicted easily, thus greatly shortening the time required.

The first step is to collect the required data and prepare them in a spreadsheet format with various columns representing the input and output parameters. If a large number of sequences/patterns are available in the input data file, to avoid long training times a smaller training file may
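As a complement to Section 3.2, the following is a minimal sketch of how the backpropagation updates of Eqs. (2)–(4) might be coded for one training pattern. It assumes a single hidden layer, logistic activations, no bias terms and weight updates after every pattern; the function names and the nested-list weight layout are illustrative choices, not taken from the original.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Return (hidden activations, output activations) for one pattern."""
    hidden = [logistic(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    out = [logistic(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, out


def train_pattern(x, target, w_hidden, w_out, dw_hidden, dw_out,
                  rate=0.5, momentum=0.5):
    """Apply one BP update in place; dw_* hold the previous weight changes."""
    hidden, out = forward(x, w_hidden, w_out)
    # Eq. (2): error term of the output nodes (logistic activation).
    delta_out = [(t - a) * a * (1.0 - a) for t, a in zip(target, out)]
    # Eq. (3): error term of hidden node i, summed over downstream nodes k.
    delta_hid = [
        h * (1.0 - h) * sum(delta_out[k] * w_out[k][i] for k in range(len(w_out)))
        for i, h in enumerate(hidden)
    ]
    # Eq. (4): learning-rate term plus momentum times the previous change.
    for i, row in enumerate(w_out):
        for j in range(len(row)):
            dw_out[i][j] = rate * delta_out[i] * hidden[j] + momentum * dw_out[i][j]
            row[j] += dw_out[i][j]
    for i, row in enumerate(w_hidden):
        for j in range(len(row)):
            dw_hidden[i][j] = rate * delta_hid[i] * x[j] + momentum * dw_hidden[i][j]
            row[j] += dw_hidden[i][j]
    return out
```

One epoch corresponds to calling `train_pattern` once for every pattern in the training set; the learning rate and momentum values shown are arbitrary defaults.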

Fig. 5. Network parameters selection procedure.
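The scaling step of Section 3.3 (each variable mapped individually into the range 0 to 1 or −1 to 1) can be sketched as below. Linear min–max scaling is assumed here as one common way of doing this; the text does not prescribe a particular formula, and the function name is hypothetical.

```python
def scale_column(values, low=0.0, high=1.0):
    """Linearly map one variable (one spreadsheet column) into [low, high]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant column carries no information; place it mid-range.
        return [(low + high) / 2.0 for _ in values]
    span = vmax - vmin
    return [low + (high - low) * (v - vmin) / span for v in values]
```

For example, `scale_column([10, 20, 30])` maps the column into the range 0 to 1, while `scale_column([10, 20, 30], -1.0, 1.0)` maps it into the range −1 to 1, which would suit a tanh-type activation.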



be created, containing samples that are as representative as possible of the whole problem domain, in order to select the required parameters and use the complete data set for the final training.

Three types of data files are required: a training data file, a test data file and a validation data file. The former and the latter should contain representative samples of all the cases the network is required to handle, whereas the test file may contain about 10% of the cases contained in the training file.

During training, the network is tested against the test file to determine accuracy and training should be stopped when the mean average error remains unchanged for a number of epochs.

This is done in order to avoid overtraining, in which case the network learns the training patterns perfectly but is unable to make predictions when an unknown training set is presented to it.

In BP networks, the number of hidden neurons determines how well a problem can be learned. If too many are used, the network will tend to try to memorize the problem, and thus not generalize well later. If too few are used, the network will generalize well but may not have enough ‘power’ to learn the patterns well. Getting the right number of hidden neurons is a matter of trial and error, since there is no science to it. In general the number of hidden neurons (N) may be estimated by applying the following empirical formula [14]

$$N = \frac{I + O}{2} + \sqrt{P_i} \qquad (5)$$

where I is the number of input parameters, O is the number of output parameters and $P_i$ is the number of training patterns available.

3.4. Artificial neural network architectures

There are a number of architectures that have been applied by various researchers. A short description of the most important ones, suitable for engineering systems, is given in this section. These include BP, general regression neural networks (GRNN) and group method of data handling (GMDH) architectures.

Architectures in the BP category include standard networks, recurrent, feedforward with multiple hidden slabs and jump connection networks. The various alternative BP architectures that can be considered are shown in Fig. 6 [14]. BP networks are known for their ability to generalize well on a wide variety of problems. BP networks are
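The stopping rule of Section 3.3 (stop training when the error on the test file remains unchanged for a number of epochs) could be checked with a helper such as the one below. The `patience` and `tolerance` parameters and their default values are assumptions introduced for illustration; the text does not specify how many epochs or what change counts as "unchanged".

```python
def should_stop(test_errors, patience=5, tolerance=1e-4):
    """Return True when the test error has stopped improving.

    test_errors: test-file error recorded after each epoch (oldest first).
    Training is considered stalled when the best error of the last `patience`
    epochs is not better than the earlier best by more than `tolerance`.
    """
    if len(test_errors) <= patience:
        return False
    best_before = min(test_errors[:-patience])
    best_recent = min(test_errors[-patience:])
    return best_recent > best_before - tolerance
```

In a training loop, the error on the test file would be appended to `test_errors` after every epoch and training halted once `should_stop` returns `True`.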

Fig. 6. Alternate backpropagation neural network architectures.
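The empirical estimate of Eq. (5), N = (I + O)/2 + √P_i, is simple to evaluate in code. In the sketch below the function name is hypothetical, and rounding to the nearest integer is an assumption, since the text does not say how a fractional result should be treated.

```python
import math


def estimate_hidden_neurons(n_inputs, n_outputs, n_patterns):
    """Eq. (5): half the number of inputs plus outputs, plus sqrt(patterns)."""
    return round((n_inputs + n_outputs) / 2 + math.sqrt(n_patterns))
```

For instance, a problem with 5 inputs, 1 output and 100 training patterns gives (5 + 1)/2 + √100 = 13 hidden neurons as a starting point for the trial-and-error search.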



a supervised type of network, i.e. trained with both inputs and outputs. BP networks are used in a large number of working applications as they tend to generalize well.

The first category of neural network architectures is the one where each layer is connected to the immediately previous layer. Generally, three layers (input, hidden, and output layer) are sufficient for the majority of problems to be handled. A three-layer BP network with standard connections is suitable for almost all problems. One, two or three hidden layers can be used, however, depending on the problem characteristics. Use of more than five layers in total generally offers no benefit and should be avoided.

The next category of architectures is the recurrent with dampened feedback from either the input, hidden, or output layer. This type of architecture is excellent for time series data. More details are given in Section 3.4.1.

The third category is the feedforward with multiple hidden slabs. These network architectures are very powerful in detecting different features of the input vectors when different activation functions are given to the hidden slabs. More details are given in Section 3.4.2.

The last category of neural network architectures is the one where each layer is connected to every previous layer. These network architectures are found to be useful when working with very complex patterns, i.e. when it may be very difficult for a human to define the different patterns that are inherent in the data.

3.4.1. Recurrent type architecture

This architecture is shown schematically in Fig. 7. Fig. 7 actually shows in detail the recurrent architecture shown in the middle of the second row in Fig. 6. It is composed of four slabs, one of which is hidden and one of which is used for dampened feedback. The extra slab is connected to the hidden layer. This architecture is commonly called ‘Jordan Elman recurrent network’. It holds the contents of one of the layers as it existed when the previous pattern was trained. In this way the network sees previous knowledge it had about previous inputs. This extra slab is sometimes called the network’s ‘long-term’ memory. The long-term memory remembers the hidden layer, which contains features detected in the raw data of previous patterns. Recurrent neural networks are particularly suitable for prediction of sequences so they are excellent for time series data. One such application for the prediction of the energy consumption of a passive solar building is shown in Ref. [15]. A BP network with standard connections responds to a given input pattern with exactly the same output pattern every time the input pattern is presented. A recurrent network may respond to the same input pattern differently at different times, depending upon the patterns that have been presented as inputs just previously. Thus, the sequence of the patterns is as important as the input pattern itself. Recurrent networks are trained the same as standard BP networks except that patterns must always be presented in the same order, i.e. random selection is not allowed. The difference in structure is that there is one extra slab in the input layer that is connected to the hidden layer just like the other input slab. This extra slab holds the contents of one of the layers (the hidden layer in this case) as it existed when the previous pattern was trained. In this way the network sees previous knowledge it had about previous inputs.

The activation function that may be used for each slab is also shown in Fig. 7. The activation function used in the input slab is linear (i.e. of the form y = x) whereas in the hidden and output slabs it is of the sigmoid form given by the logistic form:

$$y = \frac{1}{1 + e^{-x}} \qquad (6)$$

The inputs correspond to the values of the input parameters used in a problem. The learning procedure in this network may be implemented by using the BP algorithm. The learning rate, the initial value of the weights and the momentum factor need to be specified by the user at the beginning. In BP networks, the number of hidden neurons determines how well a problem can be learned. If too many are used, the network will tend to try to memorize the problem, and thus not generalize well later. If too few are

Fig. 7. Recurrent type architecture.
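The role of the extra slab in Section 3.4.1, which holds the hidden-layer contents from the previous pattern, can be illustrated with a small forward-only sketch. It assumes an Elman-style arrangement in which the extra slab is a plain, undampened copy of the previous hidden activations and omits bias terms and training; the class name and weight layout are hypothetical.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


class RecurrentNet:
    """Forward pass of a network with one extra slab fed by the hidden layer."""

    def __init__(self, w_in, w_context, w_out):
        self.w_in = w_in            # weights: input slab -> hidden nodes
        self.w_context = w_context  # weights: extra slab -> hidden nodes
        self.w_out = w_out          # weights: hidden nodes -> output nodes
        self.context = [0.0] * len(w_in)  # extra slab starts empty

    def step(self, x):
        hidden = []
        for i in range(len(self.w_in)):
            total = sum(w * xi for w, xi in zip(self.w_in[i], x))
            total += sum(w * c for w, c in zip(self.w_context[i], self.context))
            hidden.append(logistic(total))
        # The extra slab now remembers the hidden layer for the next pattern.
        self.context = hidden
        return [logistic(sum(w * h for w, h in zip(row, hidden)))
                for row in self.w_out]
```

Because the extra slab changes after every call to `step`, presenting the same input twice in succession generally yields two different outputs, which is why the order of the patterns matters for this architecture.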



used, the network will generalize well but may not have enough ‘power’ to learn the patterns well. Getting the right number of hidden neurons is a matter of trial and error, since there is no science to it. A way to overcome this difficulty is to find the number of hidden neurons by applying Eq. (5).

3.4.2. Feedforward with multiple hidden slabs architecture

This architecture has been used in a number of engineering problems for modeling and prediction with very good results [16–19]. This is a feedforward architecture, which as shown in Fig. 8, has three hidden slabs. Fig. 8 actually shows in detail the feedforward architecture shown in the middle of third row in Fig. 6. The information processing at each node site is performed by combining all input numerical information from upstream nodes in a weighted average of the form:

$$\beta_i = \sum_{j} w_{ij}\,a_{pj} + b_1 \qquad (7)$$

where $a(pi)$ is the activation for each node and $b_1$ is a constant term referred to as the bias.

The final nodal output is computed via the activation function. This architecture has different activation functions in each slab. By referring to Fig. 8, the input slab activation function is linear, i.e. $a(pi) = \beta_i$ (where $\beta_i$ is the weighted average obtained by combining all input numerical information from upstream nodes), while the activations used in the other slabs are:

Gaussian for slab 2:

$$a(pi) = e^{-\beta_i^{2}} \qquad (8)$$

Tanh for slab 3:

$$a(pi) = \tanh(\beta_i) \qquad (9)$$

Gaussian complement for slab 4:

$$a(pi) = 1 - e^{-\beta_i^{2}} \qquad (10)$$

Logistic for output slab: Similar to Eq. (6) but in the new notation:

$$a(pi) = \frac{1}{1 + e^{-\beta_i}} \qquad (11)$$

Different activation functions are applied to hidden layer slabs in order to detect different features in a pattern processed through a network. The number of hidden neurons in the hidden layers may also be calculated with Eq. (5). However, an increased number of hidden neurons may be used in order to get more ‘degrees of freedom’ and allow the network to store more complex patterns. This is usually done when the input data are highly non-linear. It is recommended in this architecture to use Gaussian function on one hidden slab to detect features in the mid-range of the data and Gaussian complement in another hidden slab to detect features from the upper and lower extremes of the data. Combining the two feature sets in the output layer may lead to a better prediction.

3.4.3. General regression neural network (GRNN) architecture

GRNN are known for the ability to train quickly on sparse data sets. In numerous tests, it was found that GRNN responds much better than BP to many types of problems, although this is not a rule. It is especially useful for continuous function approximation. GRNN can have multidimensional input, and it will fit multi-dimensional surfaces through data. GRNN work by measuring how far a given sample pattern is from patterns in the training set in N

Fig. 8. Feedforward with multiple hidden slabs architecture.
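The slab activations of Section 3.4.2 (Gaussian, tanh, Gaussian complement and logistic, Eqs. (8)–(11)) and the way three hidden slabs feed one output slab can be sketched as follows. The function names and the data layout (a list of `(weight_rows, biases, activation)` tuples, one per hidden slab) are illustrative assumptions.

```python
import math


def gaussian(beta):
    return math.exp(-beta ** 2)          # Eq. (8)


def gaussian_complement(beta):
    return 1.0 - math.exp(-beta ** 2)    # Eq. (10)


def logistic(beta):
    return 1.0 / (1.0 + math.exp(-beta))  # Eq. (11)


def slab_output(inputs, weight_rows, biases, activation):
    """Eq. (7) for every node of a slab, followed by the slab's activation."""
    return [
        activation(sum(w * a for w, a in zip(weights, inputs)) + bias)
        for weights, bias in zip(weight_rows, biases)
    ]


def forward(x, hidden_slabs, out_weights, out_biases):
    """Feed the linear input slab x to every hidden slab, then to the output slab."""
    features = []
    for weight_rows, biases, activation in hidden_slabs:
        features.extend(slab_output(x, weight_rows, biases, activation))
    return slab_output(features, out_weights, out_biases, logistic)
```

A network matching Fig. 8 would pass `gaussian`, `math.tanh` and `gaussian_complement` as the activations of the three hidden slabs, so that the output slab combines features from the mid-range and from the extremes of the data.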



Fig. 9. General regression neural network architecture.
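The four-layer structure of Fig. 9 (input, pattern, summation and output layers) can be illustrated with a short prediction routine. The sketch assumes a Gaussian kernel of the squared Euclidean distance, exp(−D²/2σ²), with a single overall smoothing factor σ; the kernel form and all names are assumptions, as the text only states that the Euclidean distance and a smoothing factor are used.

```python
import math


def grnn_predict(x, train_inputs, train_outputs, smoothing=0.5):
    """Predict the output vector for x from stored training patterns."""
    # Pattern layer: one unit per training pattern, activated by distance to x.
    weights = []
    for pattern in train_inputs:
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, pattern))
        weights.append(math.exp(-dist_sq / (2.0 * smoothing ** 2)))
    # Division unit: plain sum of the pattern-unit activations.
    denominator = sum(weights)
    # Summation units (one per output) divided by the division unit.
    n_outputs = len(train_outputs[0])
    return [
        sum(w * y[k] for w, y in zip(weights, train_outputs)) / denominator
        for k in range(n_outputs)
    ]
```

"Training" here amounts to storing the patterns, so each one is presented only once. A small smoothing factor makes the prediction follow the nearest training patterns closely, while a large one averages over many of them; for queries very far from all patterns the denominator can underflow to zero, which a production implementation would need to guard against.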

dimensional space, where N is the number of inputs in the problem. The Euclidean distance is usually adopted.

A GRNN is a four-layer feedforward neural network based on the non-linear regression theory consisting of the input layer, the pattern layer, the summation layer and the output layer (Fig. 9). There are no training parameters such as learning rate and momentum as there are in BP networks, but there is a smoothing factor that is applied after the network is trained. The smoothing factor determines how tightly the network matches its predictions to the data in the training patterns. While the neurons in the first three layers are fully connected, each output neuron is connected only to some processing units in the summation layer. The summation layer has two different types of processing units, the summation units and a single division unit. The number of the summation units is always the same as the number of the GRNN output units. The division unit only sums the weighted activations of the pattern units of the hidden layer, without using any activation function. Each of the GRNN output units is connected only to its corresponding summation unit and to the division unit (there are no weights in these connections). The function of the output units consists in a simple division of the signal coming from the summation unit by the signal coming from the division unit. The summation and output layers together basically perform a normalization of the output vector, thus making GRNN much less sensitive to the proper choice of the number of pattern units. More details on GRNN can be found in Refs. [20,21].

If more than 2000 patterns are available in the training data set, then GRNN may become too slow to be feasible unless a very fast machine is available. The reason is that applying a GRNN network requires a comparison between the new pattern and each of the training patterns.

For GRNN networks, the number of neurons in the hidden pattern layer is usually equal to the number of patterns in the training set because the hidden layer consists of one neuron for each pattern in the training set. This number can be made larger if one may want to add more patterns, but it cannot be made smaller. The number of neurons in the input layer is equal to the number of input parameters. The number of neurons in the output layer corresponds to the number of outputs.

The training of the GRNN is quite different from the training used in other neural networks. It is completed after presentation of each input–output vector pair from the training data set to the GRNN input layer only once.

The GRNN may be trained using a GA (Section 4). The GA is used to find the appropriate individual smoothing factors for each input as well as an overall smoothing factor. GAs use a ‘fitness’ measure to determine which of the individuals in the population survive and reproduce. Thus, survival of the fittest causes good solutions to progress. A GA works by selective breeding of a population of ‘individuals’, each of which could be a potential solution to the problem. In this case, a potential solution is a set of smoothing factors, and the GA is seeking to breed an individual that minimizes the mean squared error of the test set, which can be calculated by

$$E = \frac{1}{p}\sum_{p}\left(t_{p} - o_{p}\right)^{2} \qquad (12)$$

where E is the mean squared error, t the network output (target), and o the desired output vectors over all pattern p of the test set.

The larger the breeding pool size, the greater the potential of it producing a better individual. However, the networks produced by every individual must be applied to the test set on every reproductive cycle, so larger breeding pools take longer time. After testing all of the individuals in the pool, a new ‘generation’ of individuals is produced for testing. Unlike BP algorithm, which propagates the error

through the network many times seeking a lower mean squared error between the network’s output and the actual output or answer, GRNN training patterns are only presented to the network once.

The input smoothing factor is an adjustment used to modify the overall smoothing to provide a new value for each input. At the end of training, the individual smoothing factors may be used as a sensitivity analysis tool; the larger the factor for a given input, the more important that input is to the model, at least as far as the test set is concerned. Inputs with low smoothing factors are candidates for removal for a later trial.

Individual smoothing factors are unique to each network. The numbers are relative to each other within a given network and they cannot be used to compare inputs from different networks.

If the number of input, output, or hidden neurons is changed, however, the network must be retrained. This may occur when more training patterns are added because GRNN networks require one hidden neuron for each training pattern.

All the data sets used, as in all neural networks, are scaled from their numeric range into the numeric range that the neural network deals with efficiently. For the GRNN an activation function is only required in the input layer slab.

3.4.4. Group method data handling neural network (GMDH) architecture

One type of neural network that is very suitable for modeling is the GMDH neural network. The GMDH technique was invented by A.G. Ivakhnenko from the Institute of Cybernetics, Ukrainian Academy of Sciences [22,23], but enhanced by others [24]. This technique is also known as ‘polynomial networks’. Ivakhnenko developed GMDH for the purpose of building more accurate predictive models of fish populations in rivers and oceans. GMDH worked well for modeling fisheries and many other modeling applications [25]. GMDH is a feature-based mapping network.

GMDH works by building successive layers with links that are simple polynomial terms. These polynomial terms are created by using linear and non-linear regression. The initial layer is simply the input layer. The first layer created is made by computing regressions of the input variables from which the best ones are chosen. The second layer is created by computing regressions of the values in the first layer along with the input variables. Only the best, called survivors, are chosen by the algorithm. This process continues until the network stops getting better, according to a prespecified selection criterion. More details on GMDH can be found in Ref. [25].

The resulting network can be represented as a complex polynomial description of the model which is in the form of a mathematical equation. The complexity of the resulting polynomial depends on the variability of the training data. In some respects, GMDH is very much like using regression analysis, but it is far more powerful than regression analysis. GMDH can build very complex models while avoiding overfitting problems. Additionally, an advantage of GMDH is that it recognizes the best variables as it trains and for problems with many variables, the ones with low contribution can be discarded.

A typical GMDH architecture is shown in Fig. 10. The GMDH network is not like regular feedforward networks and was not originally represented as a neural network. The GMDH network is implemented with polynomial terms in the links and may be used with a genetic component to decide how many layers are built. The result of training at the output layer can be represented as a polynomial function of all or some of the inputs (for problems with many inputs some of them may be discarded depending on their contribution). With reference to Fig. 10, the network maps a vector input x to a scalar output y0. Each processing element has two inputs,

Fig. 10. A typical GMDH neural network architecture.
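Each processing element of the GMDH network in Fig. 10 combines two inputs in a general quadratic form with six weights. A minimal sketch of one such element, and of how elements might be stacked into layers, is given below. The ordering of the six weights, the three-input wiring and the function names are assumptions made for illustration; in practice the weights would come from regression and the survivors from the selection criterion.

```python
def quadratic_element(x1, x2, w):
    """General quadratic form of two inputs with six weights (assumed ordering)."""
    w0, w1, w2, w3, w4, w5 = w
    return w0 + w1 * x1 + w2 * x2 + w3 * x1 * x1 + w4 * x2 * x2 + w5 * x1 * x2


def two_layer_gmdh(x, w_a, w_b, w_c):
    """Map a three-component input vector to a scalar through two layers.

    Layer 1: element A sees (x[0], x[1]); element B sees (x[1], x[2]).
    Layer 2: element C combines the two layer-1 outputs.
    """
    a = quadratic_element(x[0], x[1], w_a)
    b = quadratic_element(x[1], x[2], w_b)
    return quadratic_element(a, b, w_c)
```

Expanding the nested calls shows why the trained network can be written out as a single polynomial in the original inputs: the output of the second layer is a quadratic of quadratics, i.e. a polynomial of degree up to four.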



which are combined in a general quadratic form with six weights to yield the output signal. Layers in these units are combined to implement the mapping.

The central idea behind GMDH is that it is trying to build a function (called a polynomial model) which would behave in such a way that the predicted and actual values of the output would be as close as possible. For many end users it may be more convenient to have a model which is able to make predictions using polynomial formulas that are widely understood, instead of a normal neural network, which operates like a ‘black box’ model. The most common approach to solving such models is to use regression analysis. The first step is to decide the type of polynomial that regression should find. For example, a good idea is to choose powers of input variables along with their covariants and trivariants as terms of the polynomial like:

$$\{x_1, x_2, x_3, \ldots, x_1^2, x_2^2, x_3^2, \ldots, x_1 x_2, x_1 x_3, \ldots, x_{n-1} x_n, x_1 x_2 x_3, \ldots\} \qquad (13)$$

The next step is to construct a linear combination of all of the polynomial terms with variable coefficients. The algorithm determines the values of these coefficients by minimizing the squared sum of differences between sample outputs and model predictions, over all samples.

The main problem when utilizing regression is how to choose the set of polynomial terms correctly. In addition, decisions need to be made on the degree of the polynomial. For example, decisions have to be made on how complex the terms should be, or whether the model should evaluate terms such as $x^{10}$, or maybe limit consideration to terms such as $x^4$ and lower. GMDH works better than regression by answering these questions before trying all possible combinations.

The decision about the quality of each model must be made using some numeric criterion. The simplest criterion (a form of which is also used in linear regression analysis) is the sum, over all samples, of the squared differences between the actual output ($y_a$) and the model’s prediction ($y_p$), divided by the sum of the squared actual outputs. This is called the normalized mean squared error (NMSE). In equation form:

$$\mathrm{NMSE} = \frac{\sum_{i=1}^{N} (y_a - y_p)^2}{\sum_{i=1}^{N} y_a^2} \qquad (14)$$

However, if only the NMSE is used on real data, the NMSE value gets smaller and smaller as long as extra terms are added to the model. This is because the more complex the model, the more exactly it fits the data. This is always true if NMSE is used alone, since it then determines the quality of the model by evaluating the same information already used to build the model. This results in an ‘overcomplex’ model or model overfit, which means the model does not generalize well because it pays too much attention to noise in the training data. This is similar to overtraining other neural networks.

To avoid this danger, a more powerful criterion is needed, which is based upon information other than that which was used to build the evaluated model. There are several ways to define such criteria. For example, the squared sum of differences between the known output and model prediction over some other set of experimental data (a test set) may be used. Another way to avoid overfitting is to introduce a penalty for model complexity. This is called the predicted squared error criterion.

Theoretical considerations show that increasing model complexity should be stopped when the selection criterion reaches a minimum value. This minimum value is a measure of model reliability.

The method of searching for the best model based upon testing all possible models is usually called the combinatorial GMDH algorithm. In order to reduce computation time, the number of polynomial terms which are used to build the models to be evaluated should be reduced. To do so, a one-stage procedure of model selection should be changed to a multi-layer procedure. This is done as follows.

The first two input variables are initially taken and combined into a simple set of polynomial terms. For example, if the first two input variables are $x_1$ and $x_2$, the set of polynomial terms would be $\{c, x_1, x_2, x_1 x_2\}$, where $c$ represents the constant term. Subsequently, all possible models made from these terms are checked and the best one is chosen; any one of the evaluated models is a candidate for survival.

Then, another pair of input variables is taken and the operation is repeated, resulting in another candidate for survival, with its own value of the evaluation criterion. By repeating the same procedure for each possible pair of $n$ input variables, $n(n-1)/2$ candidates for survival are generated, each with its own value of the evaluation criterion.

Then these values are compared and several candidates for survival are chosen which give the best approximation of the output variable. Usually a pre-defined number of the best candidates are selected for survival, which are stored in the first layer of the network and are preserved for the next layer. The candidates that are selected are called ‘survivors’.

The layer of survivors is used for inputs in building the next layer in the network. The original network inputs used in the first layer may also be chosen as inputs to the new layer. Therefore, the next layer is built with polynomials of this broadened set of inputs. It should be noted that since some inputs are already polynomials, the next layer may contain very complex polynomials.

The layer building of the GMDH procedure continues as long as the evaluation criterion continues to diminish. Each time a new layer is built the GMDH algorithm checks if the new evaluation criterion is lower than the previous one; if so it continues training, otherwise it stops training.
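The NMSE criterion of Eq. (14) and the pairwise candidate selection described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `nmse`, `quadratic_element` and `build_layer`, the ordering of the six weights, and the small data set are all hypothetical, and the weights are set by hand rather than fitted by regression as a real GMDH implementation would do.

```python
from itertools import combinations


def nmse(actual, predicted):
    """Eq. (14): summed squared errors divided by the sum of squared actual outputs."""
    num = sum((ya - yp) ** 2 for ya, yp in zip(actual, predicted))
    den = sum(ya ** 2 for ya in actual)
    return num / den


def quadratic_element(weights, xi, xj):
    """Six-weight quadratic form of two inputs (the weight ordering is an assumption)."""
    w0, w1, w2, w3, w4, w5 = weights
    return w0 + w1 * xi + w2 * xj + w3 * xi * xj + w4 * xi ** 2 + w5 * xj ** 2


def build_layer(rows, targets, weights, keep):
    """Score one candidate per input pair on (rows, targets); keep the best `keep`."""
    n_inputs = len(rows[0])
    scored = []
    for i, j in combinations(range(n_inputs), 2):  # n(n - 1)/2 pairs
        predictions = [quadratic_element(weights, r[i], r[j]) for r in rows]
        scored.append((nmse(targets, predictions), (i, j)))
    scored.sort()  # lowest criterion value first
    return scored[:keep]


# Hypothetical test set with three inputs; the target is exactly x0 * x1.
rows = [(1, 2, 3), (2, 1, 0), (3, 3, 1), (0, 2, 2)]
targets = [2, 2, 9, 0]
product_only = (0, 0, 0, 1, 0, 0)  # hand-set weights, not fitted by regression
survivors = build_layer(rows, targets, product_only, keep=2)
```

With these made-up data the candidate built on the input pair (0, 1) reproduces the target exactly, so it is ranked first among the survivors. Scoring on data not used for fitting, as the text recommends, would simply mean passing a separate test set to `build_layer`.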

4. Genetic algorithms

The GA is a model of machine learning, which derives its behavior from a representation of the processes of evolution in nature. This is done by the creation within a machine/computer of a population of individuals represented by chromosomes. Essentially these are a set of character strings that are analogous to the chromosomes that we see in the DNA of human beings. The individuals in the population then go through a process of evolution.

It should be noted that evolution as occurring in nature or elsewhere is not a purposive or directed process, i.e. there is no evidence to support the assertion that the goal of evolution is to produce mankind. Indeed, the processes of nature seem to come down to different individuals competing for resources in the environment. Some are better than others; those that are better are more likely to survive and propagate their genetic material.

In nature, the encoding for the genetic information is done in a way that admits asexual reproduction, which typically results in offspring that are genetically identical to the parent. Sexual reproduction allows the creation of genetically radically different offspring that are still of the same general species.

In an oversimplified view, at the molecular level what happens is that a pair of chromosomes bump into one another, exchange chunks of genetic information and drift apart. This is the recombination operation, which in GAs is generally referred to as crossover because of the way that genetic material crosses over from one chromosome to another.

The crossover operation happens in an environment where the selection of who gets to mate is a function of the fitness of the individual, i.e. how good the individual is at competing in its environment. Some GAs use a simple function of the fitness measure to select individuals (probabilistically) to undergo genetic operations such as crossover or asexual reproduction, i.e. the propagation of unaltered genetic material. This is fitness-proportionate selection. Other implementations use a model in which certain randomly selected individuals in a sub-group compete and the fittest is selected. This is called tournament selection. The two processes that most contribute to evolution are crossover and fitness-based selection/reproduction. Mutation also plays a role in this process.

GAs are used for a number of different application areas. An example of this would be multi-dimensional optimization problems in which the character string of the chromosome can be used to encode the values for the different parameters being optimized.

In practice, therefore, this genetic model of computation can be implemented by having arrays of bits or characters to represent the chromosomes. Simple bit manipulation operations allow the implementation of crossover, mutation and other operations.

When the GA is executed, it is usually done in a manner that involves the following cycle. Evaluate the fitness of all of the individuals in the population. Create a new population by performing operations such as crossover, fitness-proportionate reproduction and mutation on the individuals whose fitness has just been measured. Discard the old population and iterate using the new population. One iteration of this loop is referred to as a generation. The structure of the standard GA is shown in Fig. 11 [26].

With reference to Fig. 11, in each generation individuals are selected for reproduction according to their performance with respect to the fitness function. In essence, selection gives a higher chance of survival to better individuals. Subsequently, genetic operations are applied in order to form new and possibly better offspring. The algorithm is terminated either after a certain number of generations or when the optimal solution has been found. More details on GAs can be found in Refs. [27–29].

Fig. 11. The structure of standard genetic algorithm.

The first generation (generation 0) of this process operates on a population of randomly generated individuals. From there on, the genetic operations, in concert with the fitness measure, operate to improve the population.

During each step in the reproduction process, the individuals in the current generation are evaluated by a fitness function value, which is a measure of how well



the individual solves the problem. Then each individual is reproduced in proportion to its fitness; the higher the fitness, the higher its chance to participate in mating (crossover) and to produce an offspring. A small number of newborn offspring undergo the action of the mutation operator. After many generations, only those individuals who have the best genetics (from the point of view of the fitness function) survive. The individuals that emerge from this ‘survival of the fittest’ process are the ones that represent the optimal solution to the problem specified by the fitness function and the constraints.

GAs are suitable for finding the optimum solution in problems where a fitness function is present. GAs use a fitness measure to determine which of the individuals in the population survive and reproduce. Thus, survival of the fittest causes good solutions to progress. A GA works by selective breeding of a population of individuals, each of which could be a potential solution to the problem. The GA is seeking to breed an individual which either maximizes, minimizes or is focused on a particular solution of a problem.

The larger the breeding pool size, the greater the potential of it producing a better individual. However, as the fitness value produced by every individual must be compared with all other fitness values of all other individuals on every reproductive cycle, larger breeding pools take a longer time. After testing all of the individuals in the pool, a new ‘generation’ of individuals is produced for testing.

During the setting up of the GA the user has to specify the adjustable chromosomes, i.e. the parameters that would be modified during evolution to obtain the maximum value of the fitness function. Additionally, the user has to specify the ranges of these values, called constraints.

A GA is not gradient based, and uses an implicitly parallel sampling of the solution space. The population approach and multiple sampling mean that it is less subject to becoming trapped in local minima than traditional direct approaches, and can navigate a large solution space with a highly efficient number of samples. Although not guaranteed to provide the globally optimum solution, GAs have been shown to be highly efficient at reaching a very near optimum solution in a computationally efficient manner.

The GA is usually stopped after the best fitness has remained unchanged for a number of generations or when the optimum solution is reached.

The GA parameters to be specified by the user are as follows:

Population size. Population size is the size of the genetic breeding pool, i.e. the number of individuals contained in the pool. If this parameter is set to a low value, there would not be enough different kinds of individuals to solve the problem satisfactorily. On the other hand, if there are too many in the population, a good solution will take longer to be found because the fitness function must be calculated for every individual in every generation.

Crossover rate. Crossover rate determines the probability that the crossover operator will be applied to a particular chromosome during a generation. This parameter should be near 90%.

Mutation rate. Mutation rate determines the probability that the mutation operator will be applied to a particular chromosome during a generation. This parameter should be very small, near 1%.

Generation gap. Generation gap determines the fraction of those individuals that do not go into the next generation. It is sometimes desirable that individuals in the population be allowed to go into the next generation. This is especially important if the individuals selected are the most fit ones in the population. This parameter should be near 95%.

Chromosome type. Populations are composed of individuals, and individuals are composed of chromosomes, which are equivalent to variables. Chromosomes are composed of smaller units called genes. There are two types of chromosomes, continuous and enumerated, as shown in Fig. 12.

Fig. 12. Continuous and enumerated chromosomes.

Continuous chromosomes are implemented in the computer as binary bits. The two distinct values of a gene, 0 and 1, are called alleles. Multiple chromosomes make up the individual. Each partition is one chromosome, each binary bit is a gene, and the value of each bit (1, 0, 0, 1, 1, 0) is an allele. The genes in a chromosome can take on a wide range of values between the minimum and maximum values of the associated variables. One variation of continuous chromosomes is the ‘integer chromosomes’, which are used in problems that require chromosomes and genes to take only integer values.

Enumerated chromosomes consist of genes which can have more allele values than just 0 and 1, and these values are usually visible to the user. These are suitable for a category of problems, usually called combinatorial problems, of which the traveling salesman problem (TSP) is the most famous example. In the TSP, a salesman must visit some number of cities, say 6, and he only wants to visit each city once. In this problem, the objective is to find the shortest route through all cities in order to minimize his traveling expenses. The values of chromosomes range from a minimum to a maximum value, and the order of their genes in the chromosome is important. In the TSP example with six cities, there will be six genes, each with values ranging from 1 to 6. The entire chromosome represents the path through the cities, therefore each individual has only one chromosome, comprised of the entire list of cities. For example, suppose a particular chromosome in the population is 3, 4, 5, 2, 1, 6. This would represent the solution where the salesman will go first to city 3, then to city 4, and

so on. Each partition (city) is a gene, and each value (3, 4, 5, 2, 1, 6) is an allele.

It should be noted that the chromosomes in the TSP problem should not contain duplicate genes, i.e. genes with the same allele values, or else the salesman may travel to the same city more than once. Usually two different types of enumerated chromosomes are provided, ‘repeating genes’ and ‘unique genes’. For the TSP, unique genes have to be used, but some other problems may require repeating genes, where chromosomes can be 2, 3, 2, 4, 5, 2, 3 or even 3, 3, 3, 3, 3, 3, 3.

5. Fuzzy logic

Fuzzy logic is a logical system which is an extension of multi-valued logic. Additionally, fuzzy logic is almost synonymous with the theory of fuzzy sets, a theory that relates to classes of objects without sharp boundaries in which membership is a matter of degree. Fuzzy logic is all about the relative importance of precision, i.e. how important it is to be exactly right when a rough answer will work. Fuzzy inference systems have been successfully applied in fields such as automatic control, data classification, decision analysis, expert systems and computer vision. Fuzzy logic is a convenient way to map an input space to an output space, as for example, according to how hot the water is required, adjust the valve to the right setting, or according to the steam outlet temperature required, adjust the fuel flow in a boiler. From these two examples it can be understood that fuzzy logic mainly has to do with the design of controllers.

Conventional control is based on the derivation of a mathematical model of the plant from which a mathematical model of a controller can be obtained. When a mathematical model cannot be created then there is no way through classical control to develop a controller. Other limitations of conventional control are [30]:

• Plant non-linearity. Non-linear models are computationally intensive and have complex stability problems.
• Plant uncertainty. Accurate models cannot be created due to uncertainty and lack of perfect knowledge.
• Multi-variables, multi-loops and environmental constraints. Multi-variable and multi-loop systems have complex constraints and dependencies.
• Uncertainty in measurements due to noise.
• Temporal behaviors. Plants, controllers, environments and their constraints vary with time. Additionally, time delays are difficult to model.

The advantages of fuzzy control are [30]:

• Fuzzy controllers are more robust than PID controllers as they can cover a much wider range of operating conditions and can operate with noise and disturbances of different natures.
• Their development is cheaper than that of a model-based or other controller to do the same thing.
• They are customizable, since it is easier to understand and modify their rules, which are also expressed in natural linguistic terms.
• It is easy to learn how these controllers operate and how to design and apply them in an application.
• They can model non-linear functions of arbitrary complexity.
• They can be built on top of the experience of experts.
• They can be blended with conventional control techniques.

Fuzzy control should not be used when conventional control theory yields a satisfactory result and when an adequate and solvable mathematical model already exists or can easily be created.

Fuzzy logic was initiated in 1965 in the United States by Professor Lotfi Zadeh [31]. In fact, Zadeh’s theory not only offered a theoretical basis for fuzzy control, but also established a bridge connecting AI to control engineering. Fuzzy logic has emerged as a tool for controlling industrial processes, as well as household and entertainment electronics, diagnosis systems and other expert systems. Fuzzy logic is basically a multi-valued logic that allows intermediate values to be defined between conventional evaluations like yes/no, true/false, black/white, large/small, etc. Notions like ‘rather warm’ or ‘pretty cold’ can be formulated mathematically and processed in computers. Thus, an attempt is made to apply a more human-like way of thinking in the programming of computers.

A fuzzy controller design process contains the same steps as any other design process. One needs initially to choose the structure and parameters of a fuzzy controller, test a model or the controller itself, and change the structure and/or parameters based on the test results [30]. A basic requirement for implementing fuzzy control is the availability of a control expert who provides the necessary knowledge for the control problem [32]. More details on fuzzy control and practical applications can be found in Refs. [30–35].

The linguistic description of the dynamic characteristics of a controlled process can be interpreted as a fuzzy model of the process. In addition to the knowledge of a human expert, a set of fuzzy control rules can also be derived by using experimental knowledge. A fuzzy controller avoids rigorous mathematical models and is consequently more robust than a classical approach in cases which cannot be, or can only with great difficulty be, precisely modeled mathematically. Fuzzy rules serve to describe in linguistic terms a quantitative relationship between two or more variables. Processing of the fuzzy rules provides a mechanism for using them to compute the response to a given fuzzy controller input.

The basis of a fuzzy controller, or of any fuzzy rule system, is the inference engine responsible for the fuzzification of the inputs, fuzzy processing and defuzzification of the output.

Fig. 13. Operation of a fuzzy controller.

Fig. 14. Basic configuration of fuzzy logic controller.

A schematic of the inference engine is shown in Fig. 13. Fuzzification means that the actual inputs are fuzzified and fuzzy inputs are obtained. Fuzzy processing means that the inputs are processed according to the rule set to produce fuzzy outputs. Defuzzification means producing a crisp real value for a fuzzy output, which is also the controller output.

The fuzzy logic controller’s (FLC) goal is to achieve a satisfactory control of a process. Based on the input parameters the operation of the controller (output) can be determined. The typical design scheme of a FLC is shown in Fig. 14 [30]. The design of such a controller contains the following steps:

1. Define the inputs and the control variables.
2. Define the condition interface. Inputs are expressed as fuzzy sets.
3. Design the rule base.
4. Design the computational unit. Many readymade programs are available for this purpose.
5. Determine the rules for defuzzification, i.e. to transform fuzzy control output to crisp control action.

5.1. Membership functions

A membership function is a curve that defines how each point in the input space is mapped to a membership value, or degree of membership, between 0 and 1. In the literature the input space is sometimes referred to as the universe of discourse. The only condition a membership function must really satisfy is that it must vary between 0 and 1. Additionally, it is possible in a fuzzy set to have a partial membership, e.g. the weather is rather hot. The function

Fig. 15. Membership functions for linguistic variables describing an input sensor.

Fig. 16. Membership functions for linguistic variables describing motor operation.

itself can be an arbitrary curve whose shape can be defined as a function that suits the particular problem from the point of view of simplicity, convenience, speed and efficiency.

Based on signals usually obtained from sensors and common knowledge, membership functions for the input and output variables need to be defined. The inputs are described in terms of linguistic variables as, for example, Very High, High, Okay, Low and Very Low as shown in Fig. 15. It should be noted that depending on the problem, different sensors could be used showing different parameters like distance, angle, resistance, slope, etc.

Similarly, the output can be adjusted according to some membership functions as, for example, the ones presented in Fig. 16. In both cases membership curves other than the triangular can be used, such as trapezoidal, quadratic, Gaussian (exponential), cos-function and many others.

5.2. Logical operations

The most important thing to realize about fuzzy logical reasoning is that it is a superset of standard Boolean logic, i.e. if the fuzzy values are kept at their extremes of 1 (completely true) and 0 (completely false), standard logical operations will hold. In fuzzy logic, however, the truth of any statement is a matter of degree. The input values can be real numbers between 0 and 1. It should be noted that the result of the statement A AND B, where A and B are limited to the range (0, 1), can be resolved by using $\min(A, B)$. Similarly, the OR operation can be replaced with the max function so that A OR B becomes equivalent to $\max(A, B)$, and the operation NOT A is equivalent to the operation $1 - A$. Given these three functions any construction can be resolved using fuzzy sets and the fuzzy logical operations AND, OR and NOT. An example of the operations on fuzzy sets is shown in Fig. 17.

In Fig. 17 only one particular correspondence between two-valued and multi-valued logical operations for AND, OR and NOT is defined. This correspondence is by no means unique. In more general terms, what are known as the fuzzy intersection or conjunction (AND), fuzzy union or disjunction (OR), and fuzzy complement (NOT) can be defined.

The intersection of two fuzzy sets A and B is specified in general by a binary mapping $T$, which aggregates two membership functions as:

$$\mu_{A \cap B}(x) = T(\mu_A(x), \mu_B(x)) \qquad (15)$$

The binary operator $T$ may represent the multiplication of $\mu_A(x)$ and $\mu_B(x)$. These fuzzy intersection operators are usually referred to as T-norm (triangular norm) operators.

Similarly to fuzzy intersection, the fuzzy union operator is specified in general by a binary mapping $S$ as:

$$\mu_{A \cup B}(x) = S(\mu_A(x), \mu_B(x)) \qquad (16)$$

The binary operator $S$ may represent the addition of $\mu_A(x)$ and $\mu_B(x)$. These fuzzy union operators are usually referred to as T-conorm (or S-norm) operators.

5.3. If–then rules

Fuzzy sets and fuzzy operators are the subjects and verbs of fuzzy logic. While differential equations are the language of conventional control, IF–THEN rules, which determine the way a process is controlled, are the language of fuzzy control. Fuzzy rules serve to describe the quantitative relationship between variables in linguistic terms. These if–then rule statements are used to formulate the conditional statements that comprise fuzzy logic. Several rule bases of different complexity can be developed, such as:

IF Sensor #1 is Very Low AND Sensor #2 is Very Low THEN Motor is Fast Reverse
IF Sensor #1 is High AND Sensor #2 is Low THEN Motor is Slow Reverse
IF Sensor #1 is Okay AND Sensor #2 is Okay THEN Motor Off
IF Sensor #1 is Low AND Sensor #2 is High THEN Motor is Slow Forward
IF Sensor #1 is Very Low AND Sensor #2 is Very High THEN Motor is Fast Forward

In general form a single fuzzy IF–THEN rule is of the form

IF x is A and y is B THEN z is C (17)

Fig. 17. Operations on fuzzy sets.

where A, B and C are linguistic values defined by fuzzy sets on the ranges (universes of discourse) X, Y and Z, respectively. In IF–THEN rules, the term following the IF statement is called the premise or antecedent and the term following THEN is called the consequent.

It should be noted that A and B are represented as a number between 0 and 1, and so the antecedent is an interpretation that returns a single number between 0 and 1. On the other hand C is represented as a fuzzy set, and so the consequent is an assignment that assigns the entire fuzzy set C to the output variable z. In the if–then rule the word ‘is’ gets used in two entirely different ways depending on whether it appears in the antecedent or the consequent. In general the input to an if–then rule is the current value of an input variable (in Eq. (17), x and y) and the output is an entire fuzzy set (in Eq. (17), z). This will later be defuzzified, assigning one value to the output.

Interpreting an if–then rule involves two distinct parts, as follows:

1. Evaluate the antecedent, which involves fuzzifying the input and applying any necessary fuzzy operators, and
2. Apply that result to the consequent, known as implication.

In the case of two-valued or binary logic, if–then rules do not present much difficulty. If the premise is true, then the conclusion is true. In the case of a fuzzy statement, if the antecedent is true to some degree of membership, then the consequent is also true to that same degree, that is:

In binary logic: p → q (p and q are either both true or both false)
In fuzzy logic: 0.5p → 0.5q (partial antecedents provide partial implication)
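The membership functions, the min/max/complement operators and the evaluation of a rule antecedent described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the 0 to 100 sensor scale and the breakpoints of the ‘Low’ and ‘High’ sets are hypothetical and are not taken from Figs. 15 and 16.

```python
def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_and(a, b):
    """Fuzzy intersection resolved with min, as in Section 5.2."""
    return min(a, b)


def fuzzy_or(a, b):
    """Fuzzy union resolved with max."""
    return max(a, b)


def fuzzy_not(a):
    """Fuzzy complement, 1 - a."""
    return 1.0 - a


# Hypothetical linguistic values on an assumed 0..100 sensor scale.
def low(x):
    return triangular(x, 0.0, 25.0, 50.0)


def high(x):
    return triangular(x, 50.0, 75.0, 100.0)


def rule_strength(sensor1, sensor2):
    """Degree to which 'Sensor #1 is Low AND Sensor #2 is High' holds."""
    return fuzzy_and(low(sensor1), high(sensor2))


# Antecedent degree for one pair of (made-up) sensor readings.
strength = rule_strength(37.5, 70.0)
```

At the extremes 0 and 1 the three operators reduce to ordinary Boolean AND, OR and NOT, which is the superset property noted in Section 5.2. The value returned by `rule_strength` is the single number between 0 and 1 that the consequent of the rule is then held to be true to.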

It should be noted that both the antecedent and the consequent parts of a rule can have multiple components. For example the antecedent part can be:

if temperature is high and sun is shining and pressure is falling then…

In this case all parts of the antecedent are calculated simultaneously and resolved to a single number using the logical operators described before. The consequent of a rule can also have multiple parts, as for example:

if temperature is very hot then boiler valve is shut and mains water valve is open

In this case all consequents are affected equally by the result of the antecedent. The consequent specifies a fuzzy set to be assigned to the output. The implication function then modifies that fuzzy set to the degree specified by the antecedent. The most common way to modify the output set is truncation using the min function.

In general interpreting if–then fuzzy rules is a three-part process:

1. Fuzzify inputs. All fuzzy statements in the antecedent are resolved to a degree of membership between 0 and 1.
2. Apply fuzzy operator to multiple part antecedents. If there are multiple parts to the antecedent, apply fuzzy logic operators and resolve the antecedent to a single number between 0 and 1.
3. Apply implication method. The degree of support for the entire rule is used to shape the output fuzzy set. The consequent of a fuzzy rule assigns an entire fuzzy set to the output. This fuzzy set is represented by a membership function that is chosen to indicate the quantities of the consequent. If the antecedent is only partially true, then the output fuzzy set is truncated according to the implication method.

5.4. Fuzzy inference system

Fuzzy inference is a method that interprets the values in the input vector and, based on some sets of rules, assigns values to the output vector. In fuzzy logic the truth of any statement becomes a matter of degree.

Fuzzy inference is the process of formulating the mapping from a given input to an output using fuzzy logic. The mapping then provides a basis from which decisions can be made, or patterns discerned. The process of fuzzy inference involves all of the pieces that are described so far, i.e. membership functions, fuzzy logic operators and if–then rules. There are two main types of fuzzy inference systems that can be implemented, Mamdani-type [34] and Sugeno-type [35]. These two types of inference systems vary somewhat in the way outputs are determined.

Mamdani-type inference expects the output membership functions to be fuzzy sets. After the aggregation process, there is a fuzzy set for each output variable that needs defuzzification. It is possible, and sometimes more efficient, to use a single spike as the output membership function rather than a distributed fuzzy set. This is sometimes called a singleton output membership function and it can be considered as a pre-defuzzified fuzzy set. It enhances the efficiency of the defuzzification process because it greatly simplifies the computation required by the more general Mamdani method, which finds the centroid of a two-dimensional function. Instead of integrating across the two-dimensional function to find the centroid, the weighted average of a few data points can be used.

The Sugeno method of fuzzy inference is similar to the Mamdani method in many respects. The first two parts of the fuzzy inference process, fuzzifying the inputs and applying the fuzzy operator, are exactly the same. The main difference between Mamdani- and Sugeno-type fuzzy inference is that the output membership functions are only linear or constant for the Sugeno-type fuzzy inference. A typical fuzzy rule in a first-order Sugeno fuzzy model has the form

If x is A and y is B then $z = px + qy + r$ (18)

where A and B are fuzzy sets in the antecedent, while p, q and r are all constants. Higher-order Sugeno fuzzy models are possible, but they introduce significant complexity with little obvious merit. Because of the linear dependence of each rule on the system’s input variables, the Sugeno method is ideal for acting as an interpolating supervisor of multiple linear controllers that are to be applied, respectively, to different operating conditions of a dynamic non-linear system. A Sugeno fuzzy inference system is extremely well suited to the task of smoothly interpolating the linear gains that would be applied across the input space, i.e. it is a natural and efficient gain scheduler. Similarly, a Sugeno system is suitable for modeling non-linear systems by interpolating multiple linear models.

6. Hybrid systems

Hybrid systems are systems which combine two or more AI techniques in order to perform a task. The classical hybrid system is neuro-fuzzy control, whereas other types combine GAs and fuzzy control, or ANNs and GAs, as part of an integrated problem solution or in order to perform specific separate tasks of the same problem. As most of these techniques are problem specific, more details are given here for the first category. Some application examples belonging to the other categories mentioned above are given in Section 7.6.

A fuzzy system possesses great power in representing linguistic and structured knowledge by fuzzy sets and performing fuzzy reasoning and fuzzy logic in a qualitative manner, and it usually relies on domain experts to provide the necessary knowledge for a specific problem. Neural networks on the other hand are particularly effective at representing non-linear mappings in computational fashion and they are ‘constructed’ through training procedures
presented to them as samples. Additionally, while the behavior of fuzzy systems can be understood
easily due to their logical structure and step-by-step inference procedures, a neural network
generally acts as a black box, without providing explicit explanation facilities. The possibility
of integrating the two technologies was considered quite recently, leading to a new kind of system
called neuro-fuzzy control, in which several strengths of both systems are utilized and combined
appropriately.
More specifically, neuro-fuzzy control means [32]:

1. The controller has a structure resulting from a combination of fuzzy systems and ANNs.
2. The resulting control system consists of fuzzy systems and neural networks as independent
   components performing different tasks.
3. The design methodologies for constructing respective controllers are hybrid ones coming from
   ideas in fuzzy and neural control.

In this case a trained neural network can be viewed as a means of knowledge representation.
Instead of representing knowledge using IF–THEN localized associations as in fuzzy systems, a
neural network stores knowledge through its structure, and more specifically its connection
weights and local processing units, in a distributed or localized manner. Many commercial software
packages (such as Matlab) include routines for neuro-fuzzy modeling.
The basic structure of a fuzzy inference system is described in Section 5. This is a model that
maps input characteristics to input membership functions, input membership functions to rules,
rules to a set of output characteristics, output characteristics to output membership functions,
and the output membership functions to a single-valued output or decision associated with the
output. Thus, the membership functions are fixed. In this way, fuzzy inference can be applied to
modeling systems whose rule structure is essentially predetermined by the user's interpretation of
the characteristics of the variables in the model.
There are some modeling situations in which the shape of the membership functions cannot be
determined by just looking at the data. Instead of choosing the parameters associated with a given
membership function arbitrarily, these parameters could be chosen so as to tailor the membership
functions to the input/output data in order to account for these types of variations in the data
values. Where fuzzy inference is applied to a system for which a past history of input/output data
is available, these data can be used to determine the membership functions. Using a given
input/output data set, a fuzzy inference system can be constructed whose membership function
parameters are tuned or adjusted by using a neural network. This is called a neuro-fuzzy system.
The basic idea behind a neuro-fuzzy technique is to provide a method for the fuzzy modeling
procedure to learn information about a data set, in order to compute the membership function
parameters that best allow the associated fuzzy inference system to track the given input/output
data. A neural network, which maps inputs through input membership functions and associated
parameters, and then through output membership functions and associated parameters to outputs, can
be used to interpret the input/output map. The parameters associated with the membership functions
will change through a learning process. Generally, the procedure followed is similar to any neural
network technique described in Section 3.
It should be noted that this type of modeling works well if the training data presented to a
neuro-fuzzy system for training and estimating the membership function parameters are
representative of the features of the data that the trained fuzzy inference system is intended to
model. However, this is not always the case: data may be collected using noisy measurements, or
the training data may not be representative of all features of the data that will be presented to
the model. For this purpose, model validation can be used as in any neural network system. Model
validation is the process by which the input vectors from input/output data sets that the
neuro-fuzzy system has not seen before are presented to the trained system to check how well the
model predicts the corresponding data set output values.

7. Applications of artificial intelligence techniques in combustion processes

AI techniques have been used by various researchers for modeling and predictions in the field of
combustion engineering. This paper presents various such applications in a thematic rather than a
chronological or any other order. The applications for each AI technique are given independently
and are separated into two broad areas: applications dealing with combustion systems, which
include boilers, furnaces, incinerators, waste disposal and emissions prediction, and applications
dealing with IC engines modeling and control, which include diesel and spark ignition engines and
gas turbines. The numbers of applications presented in each AI field are summarized in Table 1. As
can be seen, the majority of applications deal with ANNs.

7.1. Applications of expert systems

There are only a few applications of expert systems in combustion engineering. Table 2 lists
several representative examples of the use of expert systems in combustion which are analyzed in
this paper. No applications of expert systems are reported in the field of IC engines.

Modeling of a front wall fired utility boiler for different operating conditions
Numerical modeling of a front wall pulverized coal fired utility boiler is presented by Xu et al.
[36]. This is achieved
by comparison between the predictions and measurements in the boiler. The agreement of the
calculated results with the experimental data at several operating conditions validates the models
and algorithm employed in the computation. The need to improve burner or boiler design with the
objective of producing NOx abatement while avoiding some negative side effects is also the
motivation for the prediction of furnace performance for different boiler operating conditions.
The predicted results also lay a foundation for the boiler operating expert system.

Table 1
Summary of numbers of applications presented in each field

AI technique                 Area                          Number of applications
Expert systems               Combustion                    5
Artificial neural networks   Combustion                    22
                             Internal combustion engines   14
Genetic algorithms           Combustion                    4
                             Internal combustion engines   2
Fuzzy logic                  Combustion                    7
                             Internal combustion engines   7
Neuro-fuzzy                  Combustion                    2
                             Internal combustion engines   3
Hybrid systems               Combustion                    3

Table 2
Summary of applications of expert systems in combustion

Number  Authors         Reference  Year  Subject
1       Xu et al.       [36]       2001  Modeling of a front wall boiler
2       Cho et al.      [37]       1998  Fault diagnosis of incineration
3       Ranzi et al.    [38]       1995  Automatic generation of primary oxidation reactions for
                                         low temperature combustion
4       Basu and Mitra  [39]       1994  Design of a boiler furnace
5       Kim et al.      [40]       1992  Coal combustion by-product utilization

Modeling and simulation for the fault diagnosis of rotary kiln incineration process
A rule-based, on-line expert system named RKEXPERT, used to supervise the combustion operations of
a pilot-scale rotary kiln incinerator with a special focus on fault diagnosis and feedback
control, is presented by Cho et al. [37]. The combustion process is routinely susceptible to a
large number of operational problems. The knowledge base of RKEXPERT contains the heuristics and
insights of experienced operators and also the knowledge necessary for problem solving related to
diagnosis and control. The whole knowledge in this system was organized into a unique hierarchical
structure and represented using an IF–THEN rule-based format. With this rule-based expert system
operating in on-line mode, an intelligent process operation aid system for the combustion plant
was completed to provide integrity in optimal operation and control.

Low temperature combustion: automatic generation of primary oxidation reactions and lumping
procedures
Some general rules for the automatic generation of primary oxidation reactions of large
hydrocarbon fuels are presented by Ranzi et al. [38]. The proposed approach is applied to
n-paraffins for simplicity. Nevertheless, the final goal is to feed the tested rules and kinetic
parameters into a more general and effective expert system for the generation of primary
mechanisms of real mixtures containing heavier branched hydrocarbons. Fig. 18 shows the different
phases of the proposed approach schematically. The first step is the classification of the primary
reactions involved in low-temperature oxidation, together with the definition of a limited set of
their intrinsic kinetic parameters. These independent rate constants are validated based on
primary experimental measurements of pyrolysis and oxidation. In addition to this, the authors
analyze some useful simplifications for the kinetic modeling of secondary combustion processes.
As the carbon number of the hydrocarbon fuel rises, the detailed reaction schemes become very
complex. The number of isomers of the same homologous class of molecules and radicals increases
and the number of reactions increases simultaneously. In these cases, the automatic generation of
the primary oxidation reactions and properly conceived lumping procedures, both for reactions and
components, become very useful. These lumped mechanisms of heavy species consist of a limited
number of equivalent reactions. Then, this small subset of equivalent reactions has to be coupled
with a very detailed scheme for the oxidation of C1–C4 species. The final result is still a very
large number of reactions, but one whose extension to heavier species becomes easier to handle. A
few comparisons of experimental and predicted results for the low-temperature oxidation of
n-butane and n-pentane illustrate the applicability of this approach.
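
The idea of rule-based reaction generation followed by lumping can be sketched for one reaction
class only. This is a simplified illustration under stated assumptions, not the rule set or the
kinetics of Ref. [38]: the only class considered is H-abstraction from an n-alkane, the list of
abstracting radicals (`ABSTRACTORS`) is assumed, equivalent sites are found from chain symmetry
alone, and no rate constants are assigned. All function and class names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed set of abstracting radicals, chosen for illustration only.
ABSTRACTORS = ("OH", "H", "O", "HO2")


@dataclass(frozen=True)
class Reaction:
    fuel: str
    abstractor: str
    radical: str    # alkyl radical formed (the label carries the radical site)
    n_equiv_h: int  # number of equivalent H atoms leading to this radical


def h_sites(n_carbons):
    """Return the distinct H-abstraction sites of an n-alkane as (position, H count)."""
    if n_carbons < 2:
        raise ValueError("this sketch covers n-alkanes with at least two carbon atoms")
    sites = []
    for pos in range(1, (n_carbons + 1) // 2 + 1):
        mirror = n_carbons + 1 - pos          # carbon equivalent by chain symmetry
        n_equiv_carbons = 1 if mirror == pos else 2
        h_per_carbon = 3 if pos == 1 else 2   # CH3 end groups versus CH2 groups
        sites.append((pos, n_equiv_carbons * h_per_carbon))
    return sites


def generate_h_abstractions(n_carbons):
    """Rule: every abstractor may remove an H atom from every distinct site."""
    fuel = f"n-C{n_carbons}H{2 * n_carbons + 2}"
    return [
        Reaction(fuel, x, f"C{n_carbons}H{2 * n_carbons + 1}-{pos}", n_h)
        for x in ABSTRACTORS
        for pos, n_h in h_sites(n_carbons)
    ]


def lump(reactions):
    """Merge all isomeric radicals of a fuel into one lumped radical per abstractor."""
    totals = defaultdict(int)
    for r in reactions:
        totals[(r.fuel, r.abstractor)] += r.n_equiv_h
    return [Reaction(fuel, x, "R(lumped)", n_h) for (fuel, x), n_h in totals.items()]


if __name__ == "__main__":
    for n in (4, 5):  # n-butane and n-pentane
        detailed = generate_h_abstractions(n)
        print(n, len(detailed), "detailed ->", len(lump(detailed)), "lumped")
```

The number of detailed reactions grows with chain length because longer chains have more distinct
radical sites, whereas the lumped set keeps one equivalent reaction per abstractor.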

Fig. 18. Block diagram of the general approach for the automatic generation of primary oxidation reactions and lumping procedures [38].

Design of a furnace of a circulating fluidized-bed boiler
The design of a boiler using a new technology, i.e. circulating fluidized bed combustion (FBC),
requires a considerable amount of expertise, which is a combination of experience, knowledge of
the subject, and intuition. Boiler vendors, who are required to prepare a large number of
proposals, rely heavily on the skill and judgment of their senior (expert) designers [39]. An AI
based expert system can greatly simplify this task. This system can assist expert designers to
store their experience and decision-making skill through the code of a computer program, which
remains intact and ready to apply their skill uniformly and rapidly to all designs when required.
This may allow novice designers to carry out routine proposal designs, freeing the experts to
improve current designs. An illustration of the use of expert systems in the design of only one
aspect of the furnace, namely the furnace cross-section, is given by Basu and Mitra [39]. It shows
that in addition to the standard method of determining the furnace area from the fluidization, the
design can take advantage of previous experience, which includes grate heat release rate and other
relevant parameters. The expert system also modifies the calculated value to meet different
concerns of the boiler purchaser and/or his consultants. Finally, the expert develops a compromise
of different considerations and requirements with importance attached to each one of them. The
paper also shows how the design will change when the importance attached to a particular
constraint is relaxed.

Utilization of coal combustion by-product in highway embankment
The development of a knowledge-based expert system that helps to determine the suitability of flue
gas desulphurization (FGD) by-product in embankment construction is introduced in Ref. [40]. The
expert system shell 'personal consultant plus' was used for this purpose. The knowledge base in
the embankment construction expert system (ECES) consists of a root frame and four sub-frames, and
accommodates the production rules obtained from laboratory tests as well as an extensive
literature review. An external program to calculate the embankment settlement was integrated into
the system. A case study is presented to demonstrate the use of ECES. This study provides the
embankment designer with a useful tool for making decisions on FGD by-product application as an
embankment material.

7.2. Applications of neural networks

There has been considerable interest in recent years in the use of neural networks for the
modeling and control of combustion processes because of their ability to represent non-linear
systems and their self-learning capabilities. A review of research carried out in the area of
combustion engineering is included in this section. For the interested reader, a review of ANN
applications in energy engineering is presented in Ref. [41] and a review in renewable energy
engineering is presented in Ref. [42].

7.2.1. Combustion
Table 3 lists several representative examples of the use of ANNs in combustion which are analyzed
in this paper.

Monitoring of combustion pollutant emissions
The relevant issues associated with the development of neural-based software sensors for
monitoring the pollutant emissions coming out from combustion chambers are addressed in Ref. [43].
The objective was to prove the potential of software sensors as alternative monitoring systems to
conventional analytical equipment. The preliminary results presented refer to a 4.8 MW pilot plant
operating at the Enel Santa Gilla Research Center in Cagliari, Italy.

Online prediction of the free lime content in the sintering zone for process optimization
The phase formation in clinker in the sintering zone is one of the critical operating processes in
cement manufacture. The state of the art for checking and optimizing the sintering process is
thermography, i.e. measuring the temperature distribution using infrared cameras.

Table 3
Summary of applications of artificial neural networks in combustion

Number  Authors               Reference  Year  Subject
1       Tronci et al.         [43]       2002  Monitoring of combustion emissions
2       Schmidt and Schmidt   [44]       2001  Prediction of free lime content for process optimization
3       Chong et al.          [45]       2001  Prediction of gaseous emissions from a stoker boiler
4       Blonbou et al.        [46]       2000  Active control of combustion instabilities
5       Chong et al.          [47,48]    2000  Development of a controller for a stoker fired boiler
6       Ward et al.           [49]       1999  Prediction of the thermal performance of a high-temperature furnace
7       Ferretti and Piroddi  [50]       2001  Estimation of NOx emissions in power plants
8       Larachi               [51]       2001  Kinetic prediction of coke burn-off on spent wet oxidation catalysts
9       Blonbou et al.        [52]       2000  Active adaptive combustion control
10      Blasco et al.         [53]       2000  Simulation of combustion process
11      Blasco et al.         [54]       1999  Simulation of methane–air combustion
12      Zhu et al.            [55]       1999  Prediction of coal/char combustion rate
13      Liu and Daley         [56]       1999  Predictive control of unstable combustion systems
14      Zbicinski et al.      [57]       1999  Pulse combustors for drying applications
15      Blasco et al.         [58]       1998  Modeling the temporal evolution of a combustion system
16      Christo et al.        [59]       1996  Chemical reactions representation
17      Altug and Tulunay     [60]       1996  Process controller for a fluidized bed combustion
18      Leib et al.           [61]       1996  Modeling and simulation of a fluidized bed reactor
19      Debeda et al.         [62]       1995  Thick-film pellistor array development
20      Allen et al.          [63]       1993  Combustion control system for boiler applications
21      Muller and Keller     [64]       1996  Modeling of the combustion process of incineration plants
22      Christo et al.        [65]       1995  Turbulent combustion modeling

The multi-sensor described in Ref. [44], based on neural networks, was developed specifically for
application in industrial combustion systems with heavy dust burdens, and is based on an
air-cooled two-channel endoscope. Video and thermographic images as well as various other
combustion attributes are recorded using a video, glass-fiber, or diode camera. The second optical
output is used to take a glass-fiber camera, which carries the kiln radiation directly to a
special radiation receiver in the field computer. The data which are generated can be used for
online prediction of the free lime content using a mathematical model. Links to various process
control systems are available via the developed software, which in addition to the thermographic
display creates various options for displaying the process dynamics [44]. According to the
authors, by using the developed hardware and software it is possible to achieve various
optimization objectives simultaneously and online. Continuous online determination of the free
lime content enables the kiln to be fed with a continuous, constant and minimized supply of energy
which is adjusted to suit the process. This leads to substantial savings in primary fuel and
better control of the emission of harmful substances.

Prediction of the gaseous emissions from a chain grate stoker boiler
The application of feedforward multi-layered perceptron neural networks as a simplistic means to
model the gaseous emissions emanating from the combustion of coal on a chain-grate stoker-fired
boiler is presented in Ref. [45]. The resultant 'black-box' models of the oxygen concentration,
nitrogen oxides and carbon monoxide in the exhaust flue gas were able to represent the dynamics of
the process and delivered accurate one-step-ahead predictions over a wide range of data completely
unknown to the network. This system identification approach is an alternative to the mathematical
modeling of the physical process, which, although lacking in model transparency and elegance, is
able to produce accurate one-step-ahead predictions of the derivatives of combustion. This has
been demonstrated not only with data sets that were obtained from the same series of experiments,
which also demonstrated the repeatability of the model, but also for data with a temporal
separation of almost eight months from the training data set.

Active control of combustion instabilities on a Rijke tube
The active control of combustion instabilities with feedback is a promising new tool. In their
work, Blonbou et al. [46] initially described a Rijke tube that presents, for some operating
conditions, instabilities with pressure levels up to 145 dB/Hz. To control these instabilities, an
internal model (IM) control scheme for non-linear systems that uses two ANNs is developed. The
first one, the IM, approximates the system forward dynamics. The second one, the controller,
calculates the control input. The controller's parameters are updated adaptively in real time. The
IM was first trained to reproduce the burner response (given by the pressure or heat-release
measurements) to open loop excitation. It was then used in the control loop to predict
the response of the burner to the control action. The adaptive control algorithm used this
prediction to update the controller's parameters. The developed controller is able to attenuate
the instabilities in real time for fixed or variable operating conditions successfully, and
pressure level attenuation down to −40 dB/Hz has been obtained.

Fig. 19. Chain grate stoker fired boiler controller strategy.

Controller for an industrial grate stoker fired boiler
A novel Neural Network Based Controller (NNBC) has been developed by Chong et al. [47], based on a
comprehensive set of experiments carried out on a pilot-scale stoker test facility. Subsequently,
the neural control system has been applied on an industrial chain grate stoker fired boiler. For
this application, data collected during 300 h of experiments conducted on an actual 0.75 MW chain
grate stoker boiler were used to train two neural networks. The NNBC control strategy is shown in
Fig. 19 and its objective is to provide the minimum airflow required for combustion without
incurring unacceptably high CO emissions and carbon in ash losses, thus improving the combustion
efficiency. The system detects changes in load required and gives the matching coal feed and
airflow. Initially, the 'setting network' acts as a look-up table to provide near optimum settings
of coal feed and airflow for the desired load. The setting network consists of one input (load
demand), six outputs (grate and rotary valve speeds and four primary air flow rates) and a hidden
layer with 30 neurons. A multi-layer perceptron (MLP) neural network trained with the standard BP
algorithm was employed. According to the authors, after training the network could map the given
boiler load to the desired coal feed and airflow settings with an acceptable accuracy. The
controller essentially mimics the human operator when operating the boiler after start-up. It
detects changes in the load required and delivers the matching coal feed and air from the setting
network. If the magnitude of the load change is higher than a pre-set limit, then an optimum
staging profile of coal and air feed is delivered based on past experience. After
quasi-steady-state conditions are reached, the second phase of the neural network based controller
is activated to fine tune the primary airflow rate in order to optimize the combustion process on
the grate. This is achieved by monitoring the concentration of oxygen in the flue gas on-line and
feeding the error into the corrective network, which then delivers the amount of the necessary
adjustment in order to consistently maintain an optimum air/fuel ratio during steady-state
combustion. The neural network based decision maker has one input (sample oxygen reading), two
outputs (adjustment of the primary air flow to the middle two sections of the grate) and a hidden
layer with 25 neurons. The function of this network is to deliver a decision as to how much the
primary air to the main combustion zone requires adjustment, during steady state, in order to
maintain the oxygen concentration at the optimum level.
Test results for the controller have demonstrated that improved transient and steady-state
combustion conditions were achieved without adversely affecting pollutant emissions or the
integrity of the stoker and boiler system. According to the authors, the NNBC was more consistent
at maintaining a minimum excess air level, in addition to executing the staging sequence for large
load changes with better precision. The excess air was optimized whilst avoiding unacceptable
levels of CO emissions and carbon in ash, thereby improving the combustion effectiveness [48]. The
prototype NNBC thus has potential to provide both stoker manufacturers and users with a means to
control pollutant emissions as well as improving the combustion efficiency of this type of
coal-firing equipment.

Table 4
Input and output vectors of the neural network model

Input vectors (five parameters)             Output vectors (three parameters)
Preheat temperature (K)                     Load surface temperature (K)
Fuel mass flow rate (kg/s)                  Heating time (h)
Excess air level (%)                        Total fuel used (kg)
Set point temperature (K)
Allowable load temperature difference (K)

Prediction of the thermal performance of a high-temperature furnace
The results of a study of the ability of two different neural networks to match the outputs of a
radiation zone model,
which was used to predict the transient thermal performance of a gas-fired furnace, are presented
by Ward et al. [49]. A series of small, single output neural networks were found to provide
adequate representation of the mathematical model. Two different types of neural networks were
investigated in the study, a standard feedforward MLP and a recurrent neural network. The input
and output vectors used in the networks are shown in Table 4. The feedforward neural network
consisted of three layers (input, output and one hidden) and it was trained with a BP algorithm.
The recurrent neural network was similar to the feedforward one with the addition of five 'context
units' added to the input layer and was also trained using the BP technique. The context units
remember the past inputs and feed this information into the hidden layer. This means that events
that have just happened have a stronger influence than those that occurred further in the past.

Estimation of NOx emissions in thermal power plants
Ferretti and Piroddi [50] proposed a neural network-based strategy for the estimation of the NOx
emissions in thermal power plants, fed with both oil and methane fuel. A detailed analysis based
on a three-dimensional simulation of the combustion chamber has pointed out the local nature of
the NOx generation process, which takes place mainly in the burners' zones. This fact has been
suitably exploited in developing a compound estimation procedure, which makes use of the trained
neural network together with a classical one-dimensional model of the chamber. Two different
learning procedures have been investigated, both based on the external inputs to the burners and a
suitable mean cell temperature, while using local and global NOx flow rates as learning signals.
The approach has been assessed with respect to both simulated and experimental data.

Kinetic prediction of coke burn-off on spent wet oxidation catalysts
Combustion kinetics of coke laydown on wet oxidation catalysts is studied by means of
temperature-programmed oxidation (TPO) and mass spectrometry (MS) in the temperature range from 30
to 600 °C [51]. The study is designed to allow a better understanding of the influence of wet
oxidation conditions (catalyst-phenol contacting time and temperature) on the combustion kinetics
of coke deposition during phenol degradation. In this respect, the experimental procedure involves
the continuous monitoring of carbon oxides and O2 fluxes resulting from the combustion of
carbonaceous deposits. Based on the experimental data, an ANN-based modeling approach is
implemented to represent, as accurately as possible, the complex combustion phenomenon and so
provide the opportunity to predict its evolution. In this context, the resulting ANN model is used
as a black box to approximate the complex non-linear conversion rate of the wet oxidation coke.
The conversion rate is thus expressed in terms of the TPO ramp temperature, running oxygen
concentration, wet oxidation temperature, and phenol oxidation time. According to the authors, the
proposed ANN-based modeling approach proved to be an accurate, reliable and effective tool for the
quantification of the coke burn-off kinetics. The method has great potential as a means to
compensate for the lack of efficient phenomenological kinetic modeling techniques. It can also be
used in the design of regenerative units downstream of catalytic wastewater treatment reactors
based on wet oxidation.

Active adaptive combustion control
The suppression of pressure oscillations in combustion chambers through the use of active feedback
control is a new technology with high potential. A feedback control strategy based on an internal
model (IM) control system for non-linear plants that uses ANNs is presented in Ref. [52]. This
control system uses two neural networks: the IM, which approximates the plant forward dynamics,
and a controller, which gives the appropriate control input. The controller's parameters are
updated adaptively for that purpose. The capabilities of the developed control system have been
demonstrated by the authors in a numerical simulation of the control of combustion instabilities.
Then, the ability of this neural network based control system to actively damp instabilities in a
Rijke-tube burner has been demonstrated.

Chemistry representation in combustion applications
Several alternative techniques have been proposed in the literature in order to avoid the
CPU-intensive numerical integration of the thermochemical equations in the simulation of
combustion processes. A new approach, which is based on two artificial neural-network paradigms,
namely the self-organizing map (SOM) and the MLP, is presented in Ref. [53]. The SOM is first
employed for the automatic partitioning of the thermochemical space into sub-domains. Then, a
specialized MLP is trained in order to fit the thermochemical points belonging to a given
sub-domain. The presented strategy is tested on a partially stirred reactor with a reduced
methane–air mechanism, and encouraging results are reported. The relatively modest CPU-time and
memory requirements of the method make the SOM–MLP approach a promising technique for the
inclusion of large chemical mechanisms in the context of complex applications, such as the
multi-dimensional simulation of combustion.

A single-step time-integrator of a methane–air chemical system
Blasco et al. [54] reported a novel method for embedding a reduced chemical system, suitable for
the simulation of methane–air combustion, in an ANN. The use of ANNs as a means of storing in a
compact manner the chemical kinetics of a system is an emerging alternative to other methods, the
full potential of which remains to be exploited. This work presented two novelties. The first one
is that the compositional domain is split into sub-domains, for each of which an ANN fitting is
attempted, and the second is that the time step is introduced as an additional input to the
network, thus increasing the accuracy and speed of the method [54].
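
The data flow of such a network-based time integrator can be sketched as follows. This is a
minimal, untrained illustration and not the implementation of Ref. [54]: the weights are random
placeholders, the hidden-layer width (`N_HIDDEN`) is arbitrary, and partitioning the compositional
domain by mixture-fraction intervals (`F_EDGES`) is an assumption made only to show how one network
per sub-domain is selected. The state follows the paper's choice of five controlling scalars
(mixture fraction f and the specific mole numbers of CH4, CO, O2 and H), with the time step as a
sixth input and the changes of the four mole numbers as outputs.

```python
import math
import random

# Six inputs (f, nCH4, nCO, nO2, nH, dt) and four outputs (changes of the mole numbers).
N_IN, N_HIDDEN, N_OUT = 6, 8, 4  # hidden width is an arbitrary illustrative choice


def make_mlp(seed):
    """Build an MLP with two hidden layers; weights are random placeholders (untrained)."""
    rng = random.Random(seed)
    sizes = [N_IN, N_HIDDEN, N_HIDDEN, N_OUT]
    # Each neuron stores n_in weights followed by one bias term.
    return [
        [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(layers, x):
    """Propagate x through the network: tanh hidden units, linear output units."""
    for i, layer in enumerate(layers):
        z = [sum(w * v for w, v in zip(neuron, x + [1.0])) for neuron in layer]
        x = z if i == len(layers) - 1 else [math.tanh(v) for v in z]
    return x


# Hypothetical sub-domain boundaries in mixture fraction; one network per sub-domain.
F_EDGES = (0.05, 0.2)
NETWORKS = [make_mlp(seed) for seed in range(len(F_EDGES) + 1)]


def select_network(f):
    """Pick the network responsible for the sub-domain that contains f."""
    return NETWORKS[sum(f >= edge for edge in F_EDGES)]


def advance(state, dt):
    """Advance state = (f, nCH4, nCO, nO2, nH) by one step of size dt."""
    f, *moles = state
    deltas = forward(select_network(f), [f, *moles, dt])
    # The mixture fraction is a passive scalar, so it is passed through unchanged.
    return (f, *(n + d for n, d in zip(moles, deltas)))
```

With trained weights, a single call to `advance` would replace the numerical integration of the
chemical source terms over the step `dt`.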

The chemical system consists of four steps and seven reactive species as follows:

\[
\begin{aligned}
\mathrm{CH_4 + 2H + H_2O} &\rightleftharpoons \mathrm{CO + 4H_2} \\
\mathrm{CO + H_2O} &\rightleftharpoons \mathrm{CO_2 + H_2} \\
\mathrm{2H + M} &\rightleftharpoons \mathrm{H_2 + M} \\
\mathrm{O_2 + 3H_2} &\rightleftharpoons \mathrm{2H + 2H_2O}
\end{aligned}
\tag{19}
\]

The thermochemical state of the mixture can be fully described by specifying the value of only five variables under the following hypotheses: equal diffusivity of all the species and enthalpy, no heat losses by radiation, constant pressure and ideal-gas density. These five controlling scalars are chosen to be the mixture fraction and the specific number of moles (or moles per unit mass) of CH4, CO, O2, and H. These five controlling scalars are expressed as mixture fraction f, nCH4, nCO, nO2, and nH.
All the ANNs reported in this work are based on the well-known MLP network with two hidden layers as shown in Fig. 20.
The ANN receives as an input the values of the controlling scalars (f, nCH4, nCO, nO2, and nH) and the desired time step (Δt). The input is processed by the network, which provides as output the change in the controlling scalars over the given time step (ΔnCH4, ΔnCO, ΔnO2, and ΔnH). The mixture fraction (f) is an input but not an output, since as a passive scalar it does not change with time. The paper introduces three alternative types of networks, and describes in detail the methodology used for their construction and validation. The level of accuracy attained is at least one order of magnitude better than with previously published ANN approaches. Furthermore, the ANN predictions are performed with low CPU and memory requirements compared to alternative methods, thus allowing the inclusion of the ANN approach in other computer-intensive simulation techniques.

Prediction of coal/char combustion rate
The use of an ANN for predicting the reactivity of coal/char combustion was investigated in Ref. [55]. A feedforward neural network with BP learning has been trained to predict the combustion rate of coal chars. It is well established that the combustion rate of a char is a function of various experimental parameters, for example, combustion temperature and coal structure parameters. The reactivity/combustion rate of a char increases with increasing combustion temperature and decreasing rank of the parent coal. It is also known that the char reactivity is affected by the thermal history of the char, e.g. the heating rate, maximum heat treatment temperature, residence time of pyrolysis, and it may be related to the surface area of the char. Therefore, parameters reflecting the rank of parent coals, the severity of pyrolysis and total surface area of the chars as well as the combustion temperature are chosen as inputs to train a neural network to determine a char combustion rate. The rank of a coal can be represented by the vitrinite reflectance, the fixed carbon content or volatile matter content of the coal. The vitrinite mean random reflectance R0 and fixed carbon content of a coal increase whereas the volatile matter content of a coal decreases with increasing coal rank. Therefore, any one of these parameters can be used as a coal rank parameter. The hydrogen content of a coal is gradually lost during pyrolysis and the extent of hydrogen loss depends on the pyrolysis conditions such as heating rate, heat treatment temperature and residence time, etc. Hence, it is reasonable to use the extent of hydrogen loss to represent the severity of pyrolysis.
It is well recognized that coal combustion reaction involves many steps:

• mass transfer (by diffusion) of oxygen from the bulk gas phase to the char surface and the gaseous products from the char surface to the bulk gas phase,
• adsorption of oxygen on and desorption of the gaseous products from the char surface, and
• rearrangements of the adsorbed surface species (surface reactions).

The apparent reaction rate is governed by the slowest one, i.e. the rate-determining step. Temperature has profound effects on chemical reaction rate and diffusion processes. Coal char combustion processes can therefore be divided into three regimes according to the temperature range applied: chemical control region, transition region and diffusion control region. Different mechanisms are in operation in each regime and these will require separate models. In the present study [55], the neural network was

Fig. 20. Schematic of the neural network.
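
The input/output layout of the network described for the reduced methane–air system can be illustrated with a minimal sketch. Only the overall shape comes from the text: six inputs (f, nCH4, nCO, nO2, nH and Δt), two hidden layers, and four outputs (the increments of the four reacting scalars). The hidden-layer sizes, the tanh activation, the random untrained weights and all function names are assumptions made for illustration, so the numbers produced carry no physical meaning.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply one fully connected layer, optionally followed by an activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def build_network(hidden=(10, 10), seed=0):
    """MLP with two hidden layers; the sizes in `hidden` are assumed, not from the paper."""
    rng = random.Random(seed)
    sizes = (6, *hidden, 4)  # 6 inputs (f, nCH4, nCO, nO2, nH, dt), 4 outputs (increments)
    return [make_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def predict_increments(network, f, n_ch4, n_co, n_o2, n_h, dt):
    """Return (d_nCH4, d_nCO, d_nO2, d_nH) over the time step dt."""
    x = [f, n_ch4, n_co, n_o2, n_h, dt]
    for layer in network[:-1]:
        x = dense(x, layer, math.tanh)  # tanh hidden units: an assumption
    return dense(x, network[-1])        # linear output layer: an assumption


def advance(network, state, dt):
    """Advance [f, nCH4, nCO, nO2, nH] by dt; f is a passive scalar and is left unchanged."""
    f, *moles = state
    deltas = predict_increments(network, f, *moles, dt)
    return [f] + [n + d for n, d in zip(moles, deltas)]
```

In a real application the weights would come from training against direct integration of the chemical system; the sketch only shows why the mixture fraction appears among the inputs but not among the outputs.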



Fig. 21. Output-model-based predictive control.

trained only to estimate the char-combustion rate at low temperature in the first regime where the combustion reaction is under chemical control.
A database containing the combustion rate reactivity of 55 chars derived from 26 coals covering a wide range of rank and geographic origin was established to train and test the neural networks. The heat treatment temperature of the chars ranged from 1000 to 1500 °C and the combustion rate reactivity of the chars was measured using thermogravimetric analysis in a temperature range of 420–600 °C.
The following three sets of inputs were used, respectively, to train the neural network:

1. R0, Hchar/Hcoal, A, T;
2. CF, Hchar/Hcoal, A, T;
3. CF, Hchar/Hcoal, T.

Here, the ratio of char hydrogen content (daf wt%) to coal hydrogen content (daf wt%), Hchar/Hcoal, was used to reflect the severity of pyrolysis of the char. This ratio along with combustion temperature T (K) was included in all three sets of inputs. In the first set of inputs, the vitrinite mean random reflectance R0 (%) was used as a coal rank parameter. The total surface area of the chars A (m²/g) was also used as one of the inputs. In the second set of inputs, the fixed carbon content CF (daf wt%) of the parent coal was used as a coal rank parameter instead of R0, whereas the rest of the parameters were the same as those used in the first set of inputs. The parameters used in the third set of inputs were the same as those in the second set except that the total surface area A was excluded. To avoid saturation of the neurons, all the input and output values were normalized between 0 and 1.
The results showed that when a sufficient amount of training data is available, a neural network model could be developed to predict the combustion rates of coal chars with good accuracy and robustness for all three sets of input data shown above. Total surface areas of the chars correlated to the combustion rates and when these values were used as one of the inputs to the neural network, better predictions were achieved.

Predictive control of unstable combustion systems
A novel strategy for the active stabilization of combustion systems is presented by Liu and Daley [56]. In terms of their impact on the system performance, pressure oscillations are of the most significance. In some applications, the pressure oscillations are undesirable since they result in excessive vibration, causing high levels of acoustic noise and in extreme cases mechanical failure. In the frequency domain, the pressure is characterized by dominant peaks at discrete frequencies, which correspond to the acoustic modes of the combustion chamber.
The algorithm developed comprises three parts: an output model, an output predictor and a feedback controller. The output model, which is established using neural networks, is used to predict the output in order to overcome the time delay of the system, which is often very large compared with the sampling period. An output-feedback controller is introduced which uses the output of the predictor to suppress instability in the combustion process. The output-model-based predictive control using neural networks is shown in Fig. 21. The approach developed by the authors is first demonstrated using a simulated unstable combustor with six modes.
The output-model-based predictive control using neural networks has been evaluated using an atmospheric combustion test rig with a commercial combustor. A schematic diagram of the active control system for the combustor test rig, using a loudspeaker actuation device, is shown in Fig. 22.

Fig. 22. Active control system of the combustion test rig [56].
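
The three-part structure described above (output model, output predictor, feedback controller) can be sketched on a toy problem. This is not the scheme of Ref. [56]: a known first-order linear model stands in for the neural output model, the plant is a single unstable mode rather than a multi-mode combustor, and the parameter values and function names are all assumptions chosen for illustration. The sketch only shows how predicting the output over the delay horizon lets a simple feedback law act on the future output instead of the delayed one.

```python
from collections import deque


def predict_ahead(y_now, pending_inputs, a, b):
    """Output predictor: roll the output model forward over the inputs
    that have been issued but have not yet reached the plant (oldest first)."""
    y_hat = y_now
    for u in pending_inputs:
        y_hat = a * y_hat + b * u
    return y_hat


def simulate(a=1.1, b=1.0, delay=5, gain=0.6, steps=80, y0=1.0, use_control=True):
    """Toy plant y[k+1] = a*y[k] + b*u[k-delay]; unstable when |a| > 1."""
    pending = deque([0.0] * delay)  # inputs still travelling through the delay
    y = y0
    history = [y]
    for _ in range(steps):
        y_hat = predict_ahead(y, pending, a, b)      # estimate of y[k+delay]
        u = -gain * y_hat if use_control else 0.0    # feedback on the predicted output
        pending.append(u)
        u_acting = pending.popleft()                 # input that reaches the plant now
        y = a * y + b * u_acting
        history.append(y)
    return history
```

With a perfect model the closed loop behaves as y[k+delay+1] = (a - b*gain) * y[k+delay], so the assumed values above give a contraction factor of 0.5 once the first control action reaches the plant, whereas the uncontrolled output grows by a factor of 1.1 per step.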

Pulse combustors for drying applications
Results of investigations of a valved pulse combustor to choose optimal geometry, which covered measurements of the flow rates of air and fuel, pressure oscillations, including pressure amplitude and frequency and flue gas composition are presented by Zbicinski et al. [57]. Experimental studies comparing the operation of the pulse combustor when coupled with a drying chamber and when working separately are described. It was found that coupling of the pulse combustor with a drying chamber had no significant effect on the pulse combustion process. Smoother runs of pressure oscillations in the combustion chamber, lower noise level and slightly higher NO emission were observed. The velocity flow field inside the drying chamber was measured and the results confirmed a complex character of pulsating flow in the chamber. A large experimental data set obtained from measurements enabled developing a neural model of pulse combustion process. ANNs were trained to predict amplitudes and frequencies of pressure oscillations, temperatures in the combustion chamber and emission of toxic substances. According to the authors, an excellent mapping performance of the developed neural models was obtained. Due to complex character of the pulse combustion process, the application of ANNs seems to be the best way to predict inlet parameters of a drying agent produced by the pulse combustor.

Modeling the temporal evolution of a reduced combustion chemical system
A way of embedding a combustion chemical system in a neural network, in such a way that it can be used, with considerable CPU time and RAM memory savings, in fluid-flow-simulation codes is presented in Ref. [58]. The aim of the model is the generation of two types of ANN, one representing the temporal evolution of the reactive chemical species and another one for the temperature and density of the mixture as a function of the chemical composition of the system. These are shown schematically in Fig. 23.
The aim of the reactive-species ANN is to predict, given a composition at the beginning of a time step, the mass fractions of the species at the end of this time step. The time step is fixed for the network, and thus three different ANNs are used (for Δt = 10⁻⁵, 10⁻⁴ and 10⁻³ s) to cover the span of timescales which are encountered in a typical combustion simulation. Of course, intermediate time steps can be simulated by dividing them into smaller ones having the above sizes.
The goal of the temperature–density ANN is to model the time-independent relationship between these quantities and the composition (mass fractions) of the system. Hence, a single ANN suffices.
The independent variables chosen as the inputs to both ANNs are the set of eight mass fractions. Alternatively, the five independent controlling scalars could be used as input and output for the reactive-species ANN, which has the advantage of ensuring the conservation of the number of element atoms. However, when this was done in the present work, higher errors were attained. This fact could be attributed to a more complex functional relationship between the input and output for the controlling scalars than for the mass fractions.
The performance in terms of accuracy of the networks is assessed by comparison with the results of the direct integration of the thermochemical system for a large number of random samples. Error measurements are reported, and sample evolutions of the chemical system with both methods are compared. According to the authors, the results of this exercise are satisfactory, and the CPU-time and memory savings encouraging.

Representation of chemistry with probability density function (pdf) simulation of H2/CO2 flames
A novel approach using ANNs for representing chemical reactions is developed and successfully implemented with a modeled velocity-scalar joint pdf transport equation for

Fig. 23. Neural network architectures showing inputs and outputs used to approximate the reactive species mass-fraction and temperature and
density evaluation.
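
The remark in Ref. [58] that intermediate time steps can be covered by chaining the three fixed-step reactive-species networks (Δt = 10⁻⁵, 10⁻⁴ and 10⁻³ s) can be illustrated with a short sketch. The greedy largest-first split, the use of `Decimal` to avoid floating-point remainders, and the `networks` mapping of step size to a callable are assumptions introduced here; the source does not say how the subdivision is chosen.

```python
from decimal import Decimal

# Fixed step sizes of the three reactive-species networks, largest first.
NETWORK_STEPS = (Decimal("1e-3"), Decimal("1e-4"), Decimal("1e-5"))


def split_time_step(dt):
    """Greedily split dt (seconds) into a sequence of the fixed network steps."""
    remaining = Decimal(str(dt))
    plan = []
    for step in NETWORK_STEPS:
        count = int(remaining // step)
        plan.extend([step] * count)
        remaining -= count * step
    if remaining != 0:
        raise ValueError("dt is not a multiple of the smallest network step")
    return plan


def integrate(state, dt, networks):
    """Advance `state` over dt by applying one fixed-step network per sub-step.

    `networks` is a hypothetical mapping {step size: callable(state) -> state}.
    """
    for step in split_time_step(dt):
        state = networks[step](state)
    return state
```

For example, a step of 2.3 × 10⁻⁴ s is split into two applications of the 10⁻⁴ s network followed by three applications of the 10⁻⁵ s network.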

H2/CO2 turbulent jet diffusion flames by Christo et al. [59]. The chemical kinetics are represented using a three-step reduced mechanism, and the transport equation is solved by a Monte Carlo method. A detailed analysis of computational performance and a comparison between the neural network approach and other methods used to represent the chemistry, namely the look-up table, or the direct integration procedures, are presented [59]. An MLP architecture with two hidden layers is chosen for the neural network. A hyperbolic-tangent (tanh) function is used as a transfer function. The training algorithm is based on a BP supervised learning procedure with individual momentum terms and adaptive learning rate adjustment for the weights matrix. A new procedure for the selection of training samples using dynamic randomization is developed by the authors and is aimed at reducing the possibility of the network being trapped in a local minimum. This algorithm achieved an impressive acceleration in convergence compared with the use of a fixed set of selected training samples. The feasibility of using neural network models to represent highly non-linear chemical reactions is also illustrated in this work. According to the authors, the prediction of the flow field and flame characteristics using the neural network approach is in good agreement with those obtained using other methods, and is also in reasonable agreement with the experimental data. The computational benefits of the neural network approach over the look-up table and the direct integration methods, both in CPU time and RAM storage requirements are not great for a chemical mechanism of less than three reactions. The neural network approach becomes superior, however, for more complex reaction schemes.

Process controller for a fluidized bed combustion
A neural network-based controller utilizing a modified error term in the back propagation algorithm is presented for the purpose of on-line control of the non-linear time-varying FBC process [60]. The aim of the modification of the error term was to improve the controller performance by enabling the neural network to perform a ‘negative hysteresis’ action. The general design steps, alterations in the controller and performance tests were initially carried out on a simulation model of the process. According to the authors, the proposed controller is successful in accomplishing the bed temperature control of the FBC process in the absence of any human operator. Furthermore, performance tests made on the simulation model showed that without the presence of offline pre-training, the proposed controller performs better than the conventional controller in convergence time and overshoot.

Fluidized bed reactor model for propylene partial oxidation
The use of a neural network model to simulate the performance of a fluidized-bed reactor for the partial oxidation of propylene to acrolein is investigated by Leib et al. [61]. The training set needed to generate the ANN model is obtained from a two-phase cell model of the fluidized-bed where the flow patterns for the bubble and emulsion phases in each cell are assumed to be plug-flow and perfectly mixed, respectively. The intrinsic kinetics, which are taken from the literature, are based upon a model that exhibits a non-linear dependence on both molecular oxygen and propylene. A series–parallel reaction network describes the formation of acrolein, acetaldehyde, and total combustion products. The fluidized bed model accounts for variable gas velocity as well as finite transport resistance between the bubble and emulsion phases. To perform the required ANN model training, output responses predicted from the cell model are first generated by using all possible combinations of 11 key input parameters varied over practical ranges of interest. The axial variation of the nine output responses is represented by a recurrent ANN. The ANN parameters are then identified using a special-purpose computer software package that implements both training and analysis of the input data and corresponding output responses. To simulate the behavior of a real reactor, the output responses are corrupted with random noise. Comparisons between the output responses obtained from the ANN model trained on noisy data and those from the cell model with no noise indicate that the ANN model is capable of providing filtering. Furthermore, a sensitivity analysis indicates that the ANN model captures the dependence of the output variables on the input ones.

Thick-film pellistor array development
In pellistor gas sensors, the heat exhaust produced by the catalytic combustion of reducing gases increases the temperature of the device [62]. A typical pellistor consists of a platinum (Pt) wire supported in an alumina bead impregnated with a finely dispersed noble metal like palladium (Pd). The platinum wire serves as heater of the bead to its operating temperature and as a thermometer. In reality, the temperature measured by the resistance of the Pt wire is compared to that of a reference element, which has a similar structure, but without any catalytic activity. No selectivity of such a device has to be expected since the catalytic combustion of any combustible gas will lead to a temperature increase of the device. To achieve selectivity to methane, the differential activity of palladium and platinum was initially exploited by using two screen-printed pellistors, one based on Pd and the other on Pt [62]. At around 400 °C, all reducing gases including methane are oxidized by Pd, whereas Pt oxidizes all gases except methane. In order to extend the recognition process to combustible gases other than methane, that is to propane and ethanol vapor, a small array of four pellistors with various percentages of Pd and Pt has been elaborated with thick film technology, which is very valuable for realizing series of similar sensors, required in arrays. The four micro-calorimetric sensors are exposed to various gases and various concentration values. Recognition of methane, propane and ethanol is obtained by neural network techniques. The network consists of three layers: an input

layer, a hidden layer and an output layer, which permits gas identification. BP is used as the learning algorithm.

Combustion control-system for utility boiler applications
A novel control system was developed and demonstrated by Allen et al. [63]. Advances in sensor and control system strategies for utility boiler combustion systems are required in order to meet increasingly stringent standards on emission. In many instances, this may involve selective, single-burner monitoring and control. Using an electronic camera to image the time-resolved, chemically specific emission patterns in the flame and a neural network to process and analyze the resultant images, a model utility load controller was demonstrated in a laboratory liquid-fueled spray flame facility. Characterization tests of the system’s dynamic stability were used to optimize the controller’s image processing, neural network and control algorithms. A series of closed-loop control tests characterized the system’s response to a variety of simulated load excursions. With the purely spatial information from the imaging sensor, the control system successfully interpreted the images and guided the burner through the imposed excursions. According to the authors, stable closed-loop control was demonstrated.

Modeling of the combustion process of incineration plants
Muller and Keller [64] used ANNs to model the combustion process of incineration plants with the objective to optimize the reduction of toxic emissions. Waste incineration is a dynamic process with strong inherent non-linearities and large time lags. As outlined by the authors, analytic models fail due to the complexity of the process. A fully autonomous simulation of waste incineration using ANNs has been presented which simulates effectively the process parameters for predictive control strategies.

Turbulent combustion modeling
Christo et al. [65] applied ANNs successfully to model the chemical reactions in turbulent combustion simulations. Chemical reactions are highly non-linear functions of temperature and concentrations. Input data used for training the network are the mixture fraction, and molar abundance of CO2, H, and H2O whereas the output represents the change in composition of the reactive scalars, CO2, H, and H2O, over a certain reaction time. The network showed good ability in capturing the general behavior of chemical reactions. Reasonable accuracy was obtained except for a few samples, which constituted less than 10% of the sample space.

7.2.2. Internal combustion engines
Table 5 lists several representative examples of the use of ANNs in IC engines, which are analyzed in this paper.

Diesel engine controller design
Advanced engine control systems require accurate dynamic models of the combustion process, which are substantially non-linear. The application of fast neural network models for engine control design purposes is presented by Hafner et al. [66]. In this work the special local linear radial basis function network (RBFN) is initially introduced followed by a description of the process of building adequate dynamic engine models. These neuro-models are then integrated into an upper-level emission optimization tool, which calculates a cost function for exhaust versus consumption/torque and
Table 5
Summary of applications of artificial neural networks in internal combustion engines

Number  Authors                Reference  Year  Subject
1       Hafner et al.          [66]       2000  Diesel engine controller design
2       Heister and Froehlich  [67]       2001  Non-linear time series analysis of combustion pressure data
3       Icerman and Hafner     [68]       2001  Control of mechatronic combustion engines
4       Park et al.            [69]       2001  Spark advance control using cylinder pressure
5       Li and Yurkovich       [70]       2000  Control for idle speed regulation in internal combustion engines
6       Sharkey et al.         [71]       2000  Fault diagnosis of a diesel engine
7       Thompson et al.        [72]       2001  Modeling of emissions and performance of a heavy-duty diesel engine
8       Ortmann and Glesner    [73]       1998  Development of a knock detector
9       Muller et al.          [74]       1997  Engine combustion control
10      Sharkey et al.         [75]       1996  Fault diagnosis in a marine diesel engine
11      DeNicolao et al.       [76]       1996  Modeling of the volumetric efficiency
12      Stevens et al.         [77]       1995  Performance of a spark-ignition engine
13      Morita                 [78]       1993  Optimization control for combustion parameters of petrol engines
14      Korb et al.            [79]       1999  Dynamic modeling of a gas engine

determines optimal engine settings. According to the authors, the system allows a fast application of the optimization tool at the engine test stand.

Non-linear time series analysis of combustion pressure data
The work of Heister and Froehlich [67] is concerned with the identification of the most relevant input–output data pairs for neural networks, using the concept of mutual information. In recent years, after a period of disillusion in the field of neural processing and adaptive algorithms, neural networks have been reconsidered for solving complex technical tasks. The problem of neural network training is the presentation of input–output data showing an appropriate information content which represents a given problem. The training of a neural structure will definitely lead to poor results if the relation between input and output signals shows no functional dependence but a pure stochastic behavior. A general, quantitative method is demonstrated for identifying the most relevant points from the transient measured data of a combustion engine. In this context, mutual information is employed for the problem of determining the 50% energy conversion point solely from the combustion chamber pressure during one combustion cycle.

Optimal control of mechatronic combustion engines
After a short review on mechatronic systems in general, the development of IC engines is discussed in Ref. [68]. Since the introduction of microprocessor control around 1980 an increasing number of sensors, actuators and digital control functions were introduced, replacing mechanical devices like ignition breaker and injection. In addition, several formerly mechanical components with fixed functions became active, manipulated components like the electronic throttle, injection, camshaft and valves. Further, integrated sensor systems came into mass production, like knock and speed sensors. Thus, an integration of mechanical, electromechanical and electronic components and an integration by software-based functions can be observed which is typical for mechatronic systems. As the calibration and parameter tuning has become a crucial part in the timely development because of complex interactions and the many degrees of freedom, it is shown how by specially designed experiments for the identification of the engine a considerable improvement can be reached with the aid of local linear neural networks. The engine models are then used for the optimization of static and dynamic engine control, using a multi-objective performance criterion for fuel consumption and emissions. Several results are shown for diesel engines with variable geometry turbo-chargers and exhaust gas recirculation. The way in which model-based control systems can be implemented and tested with rapid control prototyping systems is presented. A further typical mechatronic design tool is the hardware-in-the-loop simulation of the real-time simulated engine and the real electronic control unit.

Spark advance control using cylinder pressure
A spark advance control strategy based on the location of peak pressure (LPP) in spark ignition engines using ANNs is presented in Ref. [69]. The problems of the LPP-based spark advance control method are that many samples of data are required and there is a problem detecting the combustion phasing owing to hook-back during lean burn operation. In order to solve these problems, a feedforward MLP network (MLPN) is developed. The LPP and hook-back are estimated using the MLPN, which needs only five samples of output voltage from the cylinder pressure sensor. The estimated LPP can be regarded as an index for combustion phasing and can also be used as a minimum spark advance for best torque control parameter. The performance of the spark advance controller is improved by adding a feedforward controller, which reflects the abrupt changes of the engine operating conditions such as engine speed and manifold absolute pressure. The feedforward controller consists of the RBFN, and the feedback error learning method is used for the training of the network. In addition, the proposed control algorithm does not need sensor calibration and pegging (bias calculation) procedures because the MLPN estimates the LPP from the raw sensor output voltage. The feasibility of this methodology to control spark advances is closely examined through steady and transient engine operations. The experimental results have shown that the LPP shows favorable agreement with the optimal value even during the transient operation of the engine.

Control for idle speed regulation in IC engines
An adaptive sliding mode control design method is proposed for discrete non-linear systems where explicit knowledge of the system dynamics is not available. Three-layer feedforward neural networks are used as function approximator for the unknown dynamics [70]. The control law is designed based on the outputs of the approximators, and the sliding surface is defined in terms of a stable polynomial of the system outputs. Convergence of the state trajectories into a small sliding sector is proved. The method is applied to the IC engine idle speed control problem.

Fault diagnosis of a diesel engine
A multi-network fault diagnosis system designed to provide an early warning of combustion-related faults in a diesel engine is presented by Sharkey et al. [71]. Two faults (a leaking exhaust valve and a leaking fuel injector nozzle) were physically induced (at separate times) in the engine. A pressure transducer was used to sense the in-cylinder pressure changes during engine cycles under both of these conditions, and during normal operation. Data corresponding to these measurements were used to train ANNs to recognize the faults, and to discriminate between them and normal operation. Individually trained networks, some of which were trained on sub-tasks, were combined to form a multi-network system. The multi-network system is shown to be effective when compared with the performance of

the component networks from which it was assembled. The system is also shown to outperform a decision-tree algorithm and a human expert. Comparisons presented show the complexity of the required discrimination. The results illustrate the improvements in performance that can come about from the effective use of both problem decomposition and redundancy in the construction of multi-network systems.

Modeling of the emissions and performance of a heavy-duty diesel engine
IC engines are being required to comply with increasingly stringent government exhaust emissions regulations. Compression ignition piston engines will continue to be used in cost-sensitive fuel applications such as in heavy-duty buses and trucks, power generation, locomotives and off-highway applications, and will find application in hybrid electric vehicles. Close control of combustion in these engines will be essential to achieve ever-increasing efficiency improvements while meeting increasingly stringent emissions standards. The engines of the future will require significantly more complex control than existing map-based control strategies, having many more degrees of freedom than those of today [72]. Neural network-based engine modeling offers the potential for a multi-dimensional, adaptive, learning control system that does not require knowledge of the governing equations for engine performance or the combustion kinetics of emissions formation that a conventional map-based engine model requires. The application of a neural network to model the output torque and exhaust emissions from a modern heavy-duty diesel engine (Navistar T444E) is shown to be able to predict the continuous torque and exhaust emissions, for the federal heavy-duty engine transient test procedure (FTP) cycle and two random cycles to within 5% of their measured values after only 100 min of transient dynamometer training. Applications of such a neural network model include emissions virtual sensing, on-board diagnostics and engine control strategy optimization [72].

Development of a knock detector
The worldwide demands for reasonable fuel consumption and reduced engine emissions force engine developers to improve the combustion process. Automobile makers are expecting to meet these demands by increasing the engine compression ratio. However, this improvement leads back to the problem of engine knock and appropriate engine control. Thus, the ability to detect engine knock with high precision will be mandatory for car makers. A detection scheme that consists of two main blocks, multi-feature extraction and neural classification is proposed by Ortmann and Glesner [73]. A short overview of the concept and a

Engine combustion control
To be able to meet the demands of low emissions and fuel consumption of modern combustion engines, new ways have to be found to control the combustion. New sensors have been used to measure the pressure in the combustion chamber and analyze this signal with a neural network in order to receive several form factors, which can be used to control the ignition timing [74]. The neural network is trained off line with measured data and used on line to derive the form factors. The proposed algorithm can be computed in real time on conventional digital signal processors and adapted to new engines with very little effort.

Fault diagnosis in a marine diesel engine
The development of a neural network system for fault diagnosis in a marine diesel engine is described in Ref. [75]. ANNs were trained to classify combustion quality on the basis of simulated data. Three different types of data were used: pressure, temperature and combined pressure and temperature. Subsequent to training, three ANNs were selected and combined by means of a majority voter to form a system, which achieved 100% generalization to the test set. This performance is attributable to a reliance on the software engineering concept of diversity. Following experimental evaluation of methods of creating diverse neural network solutions, it was concluded that the best results should be obtained when data is taken from two different sensors (e.g. a pressure and a temperature sensor), or where this is not possible, when new data sets are created by subjecting a set of inputs to non-linear transformations. According to the authors, these conclusions have far-reaching implications for other neural network applications.

Modeling of the volumetric efficiency of IC engines
The volumetric efficiency represents a measure of the effectiveness of an air pumping system, and is one of the most commonly used parameters in the characterization and control of four-stroke IC engines. Physical models of volumetric efficiency require the knowledge of some quantities usually not available in normal operating conditions [76]. Hence, a purely black-box approach is often used to determine the dependence of volumetric efficiency upon the main engine variables, like the crankshaft speed and the intake manifold pressure. In this work, various black-box approaches for the estimation of volumetric efficiency are reviewed, varying from parametric (polynomial-type) models, to non-parametric and AI techniques, like additive models, radial basis function neural networks and MLPs. The benefits and limitations of these approaches are examined and compared. The problem considered here can be viewed as a realistic benchmark for different estimation techniques [76]. The performance of the different identification methods, as measured by the sum of square of residuals (SSR) on the validation data set and the
developed constructive learning algorithm for the cycle-by- corresponding standard deviation (SD) of residuals are
cycle detection task are presented. presented in Table 6.
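
As a rough illustration of the two metrics used to compare the identification methods, the sketch below computes the SSR and the standard deviation of the residuals for a model evaluated on a validation set. The function name `residual_metrics`, the sample data and the use of the sample (n - 1) standard deviation are assumptions made for illustration; Ref. [76] may define the SD differently.

```python
import math


def residual_metrics(measured, predicted):
    """Return (SSR, SD) of the residuals measured - predicted."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    residuals = [m - p for m, p in zip(measured, predicted)]
    # Sum of squared residuals.
    ssr = sum(r * r for r in residuals)
    mean = sum(residuals) / len(residuals)
    # Sample standard deviation (n - 1 in the denominator); an assumption.
    sd = math.sqrt(sum((r - mean) ** 2 for r in residuals) / (len(residuals) - 1))
    return ssr, sd


# Hypothetical validation data: volumetric efficiency (dimensionless).
measured = [0.80, 0.82, 0.85, 0.83]
predicted = [0.81, 0.81, 0.84, 0.84]
ssr, sd = residual_metrics(measured, predicted)
print(f"SSR = {ssr:.4f}, SD = {sd:.4f}")
```

A lower SSR on data not used for fitting indicates better generalization, which is the basis on which the methods are ranked in Table 6.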

Table 6
Performance of the different identification methods [76]

Identification method   SSR      SD of residuals
Additive model          0.0076   0.0070
Polynomial model        0.0107   0.0083
RBF (eight neurons)     0.0071   0.0068
MLP (five neurons)      0.0069   0.0067

Performance of a spark-ignition engine
Mapping the performance of an IC engine over a wide range of operating conditions is a common procedure
during development. The generation and post-processing of the data are high-cost activities. Two
approaches which offer advantages over parametric test plans have been investigated in Ref. [77]. A
statistically designed matrix of tests has been employed to map engine stability and combustion
performance parameters. This approach minimizes the number of tests required, and post-processing
techniques provide valuable insight into the relationships that exist between variables. This is
particularly useful and efficient when qualitative trends are of prime interest. When large data sets
are necessarily acquired and quantitative relationships between variables are of particular concern,
then data processing using neural networks is shown to be an effective approach. The use of this
technique is illustrated by application to evaluate relationships between engine-out emissions and
engine state variables [77].

Optimization control for combustion parameters of petrol engines
The present control method for combustion parameters, such as spark ignition timing, of a car petrol
engine is feedforward control, which is called the ‘table looking-up’ method [78]. In this work, the
author attempted to replace the conventional table looking-up method with the neural networks method,
which is a new technique for explaining such non-linear characteristics as combustion parameters, and
to investigate its advantages and disadvantages. The control of combustion parameters by neural
networks, and especially the learning characteristics of the networks when the control is adopted as
on-line control, i.e. adaptive control, is presented. As a result, it was found that neural networks
are more suitable for on-line control for car use than the table looking-up method.

Dynamic modeling of a gas engine
Korb et al. [79] presented an RBFN used to obtain a model of a gas engine, an unstable
two-input/single-output system, to be used for the design of the speed control system. The RBFN centers
are chosen using the stepwise orthogonalization algorithm and an input space compression, which helps
to avoid sparse data sets. The influence of noisy data is investigated in a non-linear system example,
in order to find the cause of the model errors in the case of the gas engine model. The quality of the
non-linear RBFN model is demonstrated by comparing measured and simulated data.

7.3. Applications of genetic algorithms

GAs have been used in applications in both combustion and IC engines. Table 7 lists application
examples of the use of GAs in these two areas, which are analyzed in this paper.

7.3.1. Combustion
Combustion process optimization: reduction of NO2 emissions via optimal postflame process
A novel approach to optimize combustion processes by GAs is presented by Homma and Chen [80]. GAs are
robust for complex systems and do not require information on the gradient of the solution space. This
feature enables them to find the optimal solution for highly non-linear systems typically encountered
in combustion process optimization. GAs were used for optimizing the operation conditions of a
transient well-mixed reactor to reduce NO/NO2 conversion in postflame processes. Two different
scenarios were considered: a pure cooling process, and a combined process
Table 7
Summary of applications of genetic algorithms in combustion and internal combustion engines

Number  Authors              Reference  Year  Area        Subject
1       Homma and Chen       [80]       2000  Combustion  Combustion process optimization for the reduction of NO2 emissions
2       Harris et al.        [81]       2000  Combustion  Optimization of reaction rate parameters for chemical kinetic modeling
3       Chakravarthy et al.  [82]       2000  Combustion  Predictive emission monitors for NOx generation in process heaters
4       Fogarty and Bull     [83]       1995  Combustion  Optimizing individual control rules and multiple communicating rule-based control systems
5       Polifke et al.       [84]       1998  ICE         Optimization of rate coefficients for simplified reaction mechanisms
6       Glielmo and Santini  [85]       2001  ICE         Model of three way catalytic converters during the warm-up phase

of dilution and cooling. For the pure cooling process, a staged cooling with two periods of constant
temperature was found to yield the minimal NO2 emission. For the combined dilution-cooling problem, a
staged dilution-cooling process was found to be optimal. Moreover, the dilution process and the cooling
process are best carried out sequentially. The two optimal processes obtained for the two different
scenarios were shown by the authors to share the same retention conditions on an air ratio-temperature
diagram. The results strongly support the idea of using GAs for designing optimal combustion processes
and exploring new combustion technologies.

Optimization of reaction rate parameters for chemical kinetic modeling of combustion
A general inversion procedure for determining the optimum rate coefficients for chemical kinetic
schemes based upon limited net species production data is presented by Harris et al. [81]. The
objective of the optimization process is to derive rate parameters such that the given net species
production rates at various conditions are simultaneously achieved by searching the parameter space of
the rate coefficients in the generalized Arrhenius form of the reaction rate mechanisms. Thus, the goal
is both to match the given net species production rates and subsequently to ensure the accurate
prediction of net species production rates over a wide range of conditions. The reaction rate data were
retrieved using an inversion technique whose minimization process is based on GAs. The results
presented are based upon the recovery of reaction rate coefficients for hydrogen/nitrogen/oxygen
flames. The successful identification of the reaction rate parameters which correspond to product
species measurement data from a sequence of such experiments clearly suggests that the progression onto
other chemical kinetic schemes and the optimization of higher-order hydrocarbon schemes can now be
realized. The results of this study therefore demonstrate that the GA inversion process promises the
ability to assess combustion behavior for fuels where the reaction rate coefficients are not known with
any confidence and, subsequently, accurately predict emission characteristics, stable species
concentrations and flame characterization. Such predictive capabilities are of paramount importance in
a wide variety of industries.

Predictive emission monitors (PEMS) for NOx generation in process heaters
Worldwide, there is an ever-increasing interest and concern about the destructive effects of air
pollution on the ecological system. The growing awareness of these effects has revealed the need to
take adequate measures to monitor and control the emissions of air pollutants. Process heaters
contribute a major share of the industrially formed emissions, particularly of NOx and CO. The
conventional approach was to monitor these emissions on a regular basis using on-line analyzers called
continuous emission monitors (CEMS). PEMS have been proven to be as accurate as the CEMS and are in
fact more economical from the cost and maintenance point of view. A PEMS developed based on the
emission data collected on a pilot plant furnace is presented in Ref. [82]. A schematic of the pilot
plant furnace on which PEMS studies were carried out is shown in Fig. 24. The pilot plant is a refinery
modular unit and consists of a box-type radiant section with a vertical up-shot burner. Hot flue gas
from the radiant section enters the outboard convection bank where the process fluid is preheated
before entering the furnace. The convection bank is mounted over a plate-type air preheater, which is
used to supply hot air for combustion by exchanging heat with the flue gases. Hot oil used as the
process fluid is pumped through the convection bank and later passes through the radiant section. The
fluid is cooled first in an air cooled exchanger and later in a water cooled exchanger before going to
the surge tank. The furnace oil used as fuel is heated and pumped to the burner for combustion. The
fuel handling facility consists of a day tank and a pumping and pressure control system. The firing and
operation of the furnace is controlled with the help of a burner management system. Preheated
combustion air is

Fig. 24. Schematic of the furnace pilot plant [82].



supplied with the help of a forced draft fan. An arrangement exists for the control of excess air and
air preheat temperature in the furnace. The draft in the furnace is maintained with the help of an
induced draft fan. A chimney of sufficient height has been provided for operation in natural draft
mode.
A special arrangement has been made to draw a sample from the arch level through a stainless steel tube
inserted at the arch level. The sample passes through a sample conditioner and through a T-junction to
both a combustion analyzer (for measuring O2, CO, CO2) and a NOx analyzer. The analyzer has a provision
to give ppm (parts per million) of NO and ppm of NOx formed separately [82]. The NOx kinetic parameters
were tuned using a heuristic optimizer, a GA, which minimizes the least squared error between the model
and experimental data. The model thus tuned could be used to predict NOx, CO and CO2 emissions with
reasonable accuracy, as well as for model predictive control of these emissions.

Optimizing individual control rules and multiple communicating rule-based control systems
GAs can be used to optimize either individual process-control rules or complete rule-based controllers.
The optimization of individual rules to control combustion in multiple burner installations is
presented by Fogarty and Bull [83]. To solve more complex problems where more than one rule base is
necessary, a method of optimizing multiple communicating rule-based control systems with distributed
GAs working in parallel is proposed. The method is demonstrated on a track-following task using two
communicating rule bases to control a vehicle.

7.3.2. Internal combustion engines
Optimization of rate coefficients for simplified reaction mechanisms
A general procedure for determining optimum rate coefficients of simplified kinetic mechanisms is
presented by Polifke et al. [84]. It should be noted that the flows in gas turbine combustors are
highly turbulent. The objective of the optimization is to match heat release or net species production
rates of the simplified and an underlying detailed kinetics mechanism. A GA, with a population size of
24 and 64 children per generation, is employed to carry out the matching procedure with a minimum
requirement of human effort and expertise. The GA typically ceases to make progress after several
hundred generations, consuming a few minutes of CPU time on a modern desktop computer. Applications of
optimized two- and three-step schemes to lean-premixed laminar methane flames show very promising
results. Profiles of temperature and main species and the peak values of intermediate species match
those obtained with the detailed kinetic mechanism very well; thus flame speeds have been reproduced
with good accuracy. According to the authors, the optimized mechanisms have proven to be numerically
robust and efficient, whereas the flexibility and ease of use make the approach presented particularly
appropriate for the numerical modeling of combustion in situations of technical interest.

Model of three way catalytic converters during the warm-up phase
New regulations for emission control require the improvement of the system composed of a spark ignition
IC engine and a three-way catalytic converter (TWC) [85]. In particular, an important problem is to
minimize harmful emissions during the transient warm-up phase, when the TWC is not working yet and,
hence, a large amount of pollutants is emitted into the air. Toward this goal, a dynamical
thermochemical TWC model simple enough for the design and test of warm-up control strategies is
presented by Glielmo and Santini [85]. The model is obtained through an asymptotic approximation of a
more detailed model, i.e. by letting the adsorption coefficient between gas and substrate tend to
infinity. Further, the authors presented a fast integration algorithm based partly on a method of
linear space-discretization, partly on the method of characteristics for quasi-linear hyperbolic
partial differential equations, the separation being allowed by a two time scale analysis of the
system. The model has been identified, through a purposely designed GA, and validated on experimental
data.

7.4. Applications of fuzzy logic

Fuzzy logic has been used in applications in both combustion and IC engines. It has been used mainly in
controller design as described in the applications that follow. Table 8 lists several representative
examples of the use of fuzzy logic in combustion and IC engines, which are analyzed in this paper.

7.4.1. Combustion
Statistical evaluation of PCDD/PCDF emission data for solid waste combustion
An advanced statistical analysis technique using the fuzzy clustering method was employed by Samaras et
al. [86] for the evaluation of PCDD/F emissions during solid waste combustion. In addition, this
technique was applied for the assessment of the effect of an inhibitor (urea) on the toxic compound
releases and on the various isomer distributions. Municipal solid wastes (MSW) were combusted in a
lab-scale reactor and the toxic gas emissions were measured at the unit outlet. Combustion tests of
urea-fuel mixtures were classified in the same group, indicating that urea affected the formation
mechanisms of toxic gases. Combustion tests of single fuel were not included in the same group.
Furthermore, the ability of urea to modify the gas emissions pathways was not affected by the method of
its addition to the fuel.
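
To make the idea of fuzzy clustering of combustion tests more concrete, the following minimal sketch groups a handful of tests with fuzzy c-means. The source only states that "the fuzzy clustering method" was used, so the choice of fuzzy c-means, the fuzzifier `m = 2`, the function name `fuzzy_c_means` and the two-feature data points are all assumptions made for illustration and are not taken from Ref. [86].

```python
import math


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6):
    """Minimal fuzzy c-means; returns (centers, memberships)."""
    # Deterministic initialisation: pick evenly spaced points as centers.
    step = max(1, len(points) // n_clusters)
    centers = [list(points[i * step]) for i in range(n_clusters)]
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            row = [1.0 / sum((d_k / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
                   for d_k in dists]
            memberships.append(row)
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for k in range(n_clusters):
            weights = [u[k] ** m for u in memberships]
            total = sum(weights)
            new_centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(len(points[0]))
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


# Hypothetical, normalized two-feature descriptions of six combustion tests:
# the first three stand for urea-fuel mixtures, the last three for single fuel.
tests = [(0.10, 0.90), (0.15, 0.85), (0.12, 0.88),
         (0.80, 0.20), (0.85, 0.15), (0.78, 0.22)]
centers, memberships = fuzzy_c_means(tests, n_clusters=2)
# Hard label of each test: the cluster with the highest membership degree.
labels = [max(range(2), key=row.__getitem__) for row in memberships]
print(labels)
```

Unlike hard clustering, each test receives a membership degree in every group, so borderline tests can be identified from rows whose memberships are close to each other.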

Table 8
Summary of applications of fuzzy logic in combustion and internal combustion engines

Number  Authors              Reference  Year  Area        Subject
1       Samaras et al.       [86]       2001  Combustion  Statistical evaluation of PCDD/PCDF emissions data for solid waste combustion
2       Li and Chang         [87]       2000  Combustion  Combustion control of stoker-fired boilers
3       Kiriakidis et al.    [88]       1999  Combustion  Control of gaseous processes of fuel combustion
4       Fujii et al.         [89]       1997  Combustion  Control for reducing emissions from flue gas of refuse incineration furnace
5       Chang and Wang       [90]       1996  Combustion  Optimal planning of solid-waste management systems
6       Shiono et al.        [91]       1993  Combustion  Stable combustion in sludge melting furnace
7       Suzuki et al.        [92]       1992  Combustion  Statistical analysis of dynamics of a rotary kiln sewage-sludge incinerator
8       Owens and Segal      [93]       2001  ICE         Air temperature controller for a supersonic combustion test facility
9       Trebi-Ollenu et al.  [94]       2001  ICE         Throttle control for an all-terrain vehicle
10      Zilouchian et al.    [95]       2000  ICE         Controller of a jet engine fuel system
11      Wang et al.          [96]       2000  ICE         Ignition timing control for a spark ignition engine
12      Laukonen et al.      [97]       1995  ICE         Fault detection and isolation for an experimental IC engine
13      Nam et al.           [98]       1994  ICE         Control of gasoline fuel-injection system
14      Matsumoto et al.     [99]       1994  ICE         Internal combustion engines control

Combustion control of stoker-fired boilers
Combustion control of an industrial stoker-fired boiler is used to provide a continuous supply of steam
at the desired condition of pressure. Because no efficient mathematical model of the stoker-fired
boiler is available, it would be very hard to design its controller by using any traditional
model-based method. A hybrid fuzzy logic proportional plus conventional integral-derivative (FUZZY P +
ID) controller to improve the control performance yielded by the PID-type controller is presented by Li
and Chang [87]. The basic combustion control scheme in Fig. 25 shows that the multiple regulating
variables are coal and air flow. The control application design of such units until today has been
based almost entirely on the skill and intuition of experienced utility boiler control application
engineers. What are known as ‘model-based control’ methods have not been used to any significant extent
to solve the complex and interactive control problems of boiler systems, because of the high
non-linearity and uncertainty of the boiler system.

Fig. 25. Combustion control scheme of a stoker-fired boiler.

At present, conventional PID-type controllers are most widely used in control of industrial
stoker-fired boilers due to their simple control structure, ease of design and low cost. However, these
PID controllers cannot yield a good control performance due to the high non-linearity and uncertainty
of the boiler systems. Furthermore, when a strong load change or a large disturbance exists, the
PID-type controller might lose control, so that manual control must be applied.
It was reported about 20 years ago that an FLC is very suitable for a controlled object with
non-linearity and even with unknown structure. Since solid fuel (coal) causes a large time lag, it is
laborious to find fuzzy rules and membership functions manually during system operation. The proposed
FUZZY P + ID controller is constructed by using an incremental FLC in place of the proportional term in
the conventional PID controller. The basic idea was to reduce the number of fuzzy controller parameters
to be tuned so that, in comparison with the PID-type controller, only one additional parameter has to
be adjusted. The proposed hybrid FUZZY P + ID controller is applied to several stoker-fired boilers.
The real operation results show that the FUZZY P + ID controller is more efficient and robust than the
PID-type controller. Since the combustion systems of the stoker-fired boilers have a large time lag and
the controllers’ parameters are tuned during the boiler’s operation, engineers prefer that only a few
parameters of the controller need to be adjusted. For this reason, the fuzzy inference rules of the
proposed FUZZY P + ID controller are only 3 × 3 in dimension. Therefore, this FUZZY P + ID controller
is practical for improving control performance of an industrial plant which is already controlled by a
conventional PID-type controller.

Control of gaseous processes of fuel combustion
The active control problem of gaseous processes such as primary air atmospheric-suction and
forced-draft supply of air for fuel combustion is addressed by

Kiriakidis et al. [88]. The objective is to regulate gas velocity at particular locations within the
system, so that an appropriate volume flow rate is achieved. Using modal expansion and treating the
high-order modes as unmodeled dynamics, the governing law of momentum conservation reduces to a finite
set of ordinary differential equations. Due to variations of the gas properties with operating
conditions, parametric uncertainty exists in the reduced-order model obtained. Moreover, inclusion of
the fan characteristic and actuator dynamics introduces additional uncertainty and non-linearity in the
model. To avoid relying on estimation of parameters that vary with operating conditions or on
conservative bounds on the uncertainty, the proposed controller has variable structure with adaptive
switched gain. A fuzzy-logic-based inference engine realizes the adaptive law that tunes the switched
gain to the smallest value that verifies the sliding condition. In effect, this novel design reduces
the tendency and magnitude of chattering, a drawback of conventional sliding control. The fuzzy logic
sliding controller is tested on a prototype air-handling unit and compared with proportional-integral
(PI) control, a standard for such applications. According to the authors, the advocated controller
overshoots less on square-wave inputs and accurately tracks higher-order inputs in the steady state.
Further experimental investigation demonstrates robustness to structured and unstructured uncertainty.

Control for reducing both CO and NOx from flue gas of refuse incineration furnace
A new combustion controller that reduces both CO and NOx concentrations in the exhaust gases of
municipal refuse incineration furnaces is presented by Fujii et al. [89]. The manipulation of the
cooling airflow rate is important because it affects the furnace temperature and the NOx and CO
concentrations in the exhaust gases. The new controller manipulates the cooling airflow rate. The
controller monitors the furnace temperature and the NOx and CO concentrations as process variables and
then operates the cooling airflow rate as a manipulated variable. This multi-input/single-output
controller was designed using a fuzzy control algorithm and tested on site at a municipal refuse
furnace. According to the authors, the results of the on-site test were quite successful.

Optimal planning of solid-waste management systems
The emphasis on waste reduction and recycling requirements prior to incineration and the promulgation
of good combustion practice for emission control of trace organic compounds during incineration have
created conflicting solid-waste management goals [90]. Questions such as to what extent recycling and
incineration are compatible, and what the subsequent economic impacts on the private and public sectors
are under specific management scenarios, are most critical in system planning. However, the inherent
complexity of composition, generation, and heat value of the waste streams as well as the stability of
the secondary material market may result in additional difficulties in management decision making. A
non-linear fuzzy goal programming approach for solving such problems is presented by Chang and Wang
[90]. In particular, it was demonstrated how fuzzy, or imprecise, objectives of the decision makers can
be quantified through the use of specific membership functions in various types of management-planning
scenarios.

Stable combustion in sludge melting furnace
Combustion control using fuzzy logic for a sewage sludge-melting furnace is presented by Shiono et al.
[91]. The sewage sludge melting process is gaining attention as an effective method for reducing sludge
volume. The slag from molten sludge is stable, causes no pollution, and can be used as a by-product.
However, the operation of a sludge melting plant requires expert operators who have to deal with
complex phenomena that take place in a melting furnace. In an effort to facilitate the operation, the
authors have employed a control system, which features the use of fuzzy logic for automatic combustion
control in a cyclone type sewage sludge-melting furnace. According to the authors, this fuzzy logic
combustion control system functioned effectively for controlling the furnace temperature, the O2
concentration in the furnace outlet flue gas, and the concentration of NOx.

Statistical analysis of dynamics of a rotary-kiln sewage-sludge incinerator
Rotary kiln sewage sludge incinerators play an important role in the sewage treatment system. However,
a practical control system has not yet been developed because of the complexity of the process,
including large disturbances of inputs and lack of accurate instruments for measuring state variables.
An interactive modeling method based on fuzzy set theory is applied to modeling and analyzing the
dynamic performance of the process as described in Ref. [92]. In fuzzy modeling, the process dynamics
are represented by a number of IF-THEN rules comprised of fuzzy variables in the premise (IF) part and
an ARX (auto regressive with exogenous input) model in the consequence (THEN) part. Each rule describes
the local characteristics of the process in one sub-space. By integrating these sub-models, the process
performance can be predicted more accurately. A fuzzy model was identified by using time-series data
after various statistical examinations. The model accuracy was evaluated by an error index and computer
simulations, and the model was compared with a conventional time-series model. The results showed that
the fuzzy model is superior in describing the process performance. According to the authors, fuzzy
modeling is also expected to be an effective method for other ill-structured processes.
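
The rule structure described above, with fuzzy premises and ARX consequents, can be sketched as follows. This is a minimal illustration only: the two rules, the triangular membership functions, the first-order ARX coefficients and the names `tri`, `RULES` and `ts_predict` are hypothetical and are not taken from Ref. [92].

```python
def tri(x, left, peak, right):
    """Triangular membership function with its maximum at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Each rule pairs a premise membership function on y[k-1] with a local
# first-order ARX model y[k] = a*y[k-1] + b*u[k-1] + c (hypothetical values).
RULES = [
    {"mf": (-1.0, 0.0, 1.0), "arx": (0.9, 0.1, 0.0)},  # IF y[k-1] is LOW
    {"mf": (0.0, 1.0, 2.0), "arx": (0.6, 0.4, 0.1)},   # IF y[k-1] is HIGH
]


def ts_predict(y_prev, u_prev, rules=RULES):
    """One-step prediction: membership-weighted average of local ARX models."""
    weights = [tri(y_prev, *rule["mf"]) for rule in rules]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("y_prev lies outside the support of all rules")
    local = [a * y_prev + b * u_prev + c
             for (a, b, c) in (rule["arx"] for rule in rules)]
    return sum(w * y for w, y in zip(weights, local)) / total


# Simulate a short response to a constant (hypothetical) input u = 1.
y = 0.0
trajectory = []
for _ in range(5):
    y = ts_predict(y, 1.0)
    trajectory.append(y)
print(trajectory)
```

Each local ARX model is valid only near the region where its premise membership is high; blending the local outputs by membership degree is what lets a small set of linear sub-models describe a non-linear process.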

Fig. 26. Schematic of the pressure tank system [95].

7.4.2. Internal combustion engines
Air temperature controller for a supersonic combustion test facility
The high speed core flow of a hypersonic air-breathing engine reduces the available residence time of
the fuel within the combustor. This results in the primary technological challenge facing development
of supersonic combustion engines: the achievement of efficient mixing and combustion within a
supersonic air stream for reasonable size combustors.
In this work hydrogen gas is burned in air to raise and maintain the stagnation temperature of a
supersonic combustion test facility to a desired setpoint [93]. In order to reach the desired operating
conditions for stagnation temperature, there are three phases to the hydrogen control: H2 ignition at
facility start-up, H2 ramp-up while the facility is ramped-up, and H2 iteration to achieve the desired
temperature setpoint. Each phase incorporates a different type of control. Fuzzy logic is used to
design a computer based supervisory controller that recognizes the different phases of operation and
chooses the appropriate control method [93].

Throttle control for an all-terrain vehicle
An adaptive fuzzy throttle control for an all-terrain vehicle (ATV) powered by an IC engine is
presented in Ref. [94]. The design objective was to provide smooth throttle movement and zero
steady-state speed error, and to maintain a selected vehicle speed over varying road slopes for a 2-30
mile/h speed range. Unlike modern production vehicles, which have microprocessor-based engine
management systems, the ATV's engine is mechanically controlled via a carburetor. According to the
authors, a complete mathematical model of the engine is not available, making it very difficult to
apply conventional control techniques. Using experience and data collected from extensive experiments
conducted on the ATV throttle mechanism, an adaptive fuzzy throttle control algorithm was designed. A
candidate Lyapunov function was employed in the adaptive law synthesis to ensure convergence.

Controller for a jet engine fuel system
The design, implementation and evaluation of two types of FLC are presented by Zilouchian et al. [95].
The system under consideration controls fuel delivery in a jet engine test bench. It is an extension of
a jet engine fuel system test bench, whose purpose is to simulate the jet engine’s combustion pressure.
The process diagram is shown in Fig. 26. Fuel flow is sprayed into the tank through fuel nozzles. The
level of the fluid in the tank is controlled by a controller as shown. The pressure is regulated by a
digital engine model in real time, which is running on a mainframe computer. The same computer is also
utilized to execute the pressure control, which is designed with a sampling period of 10 ms. Combustor
pressure is an important parameter in the fuel system stability and needs to be controlled to a ±1%
error tolerance for steady-state operation, and a ±2.5% error band for transient operation. Without
these tight specifications many of the tests that are run to clear fuel system components for jet
engine operation are considered invalid.
The system functions as follows. Fuel flow is sprayed into the tank through fuel nozzles, as would be
the case on the engine into the combustor. An analog controller controls the level of the fluid in the
tank. There are two electro-hydraulic servo valves that are used to control the pressurization and
venting (depressurization) of the tank. The pressurizing valve is hooked to a constant 80 bar nitrogen
gas source and the vent valve is tied to a vacuum pump to aid in depressurization of the tank. All of
the above actions are included in the model with the addition of all plumbing volumes as measured from
the actual system. The pressure in the tank is read by a pressure transducer and is fed back to the
computer.
Two inputs and one output are defined for the controller as shown in Fig. 27. The two inputs to the
fuzzy controller

Fig. 27. Configuration of closed loop control system.

Fig. 28. Membership functions [95].

(antecedents) are ERROR and DELTA-ERROR, which are defined as follows [95]:

ERROR: the set pressure point minus the actual pressure output.
DELTA-ERROR: the difference between the present and previous error.

The consequence part of the fuzzy rule base system is the control action (output). With inputs and
outputs defined as above, the next step is to partition the inputs/output over their respective
universe of discourse into fuzzy membership functions and assign appropriate fuzzy linguistic
variables. In simple terms, overlapping fuzzy sets (membership functions) can be created over the
entire range of given inputs and outputs. For the proposed FLC design, seven membership functions with
the following linguistic variables are chosen:

NL: negative large,
NM: negative medium,
NS: negative small,
Z: zero,
PS: positive small,
PM: positive medium,
PL: positive large.

Similarly, membership functions are selected for both inputs and outputs, with the exception of limited
NL and PL. In order to improve the desired performance, both input and output membership functions were
scaled to the interval [-1, +1]. The proper scaling may be applied prior to and after the execution of
the fuzzy controller.
Two methods of designing FLCs were tried. The first method included the development of a tool that
inputs the rules and membership functions and outputs the appropriate consequences. The second method
was based on bivariate curve development and scaling. Triangular fuzzy membership functions were
defined for the first method as shown in Fig. 28 with their associated linguistic variable names. Table
9 represents the fuzzy rule base for the control of the process.

Table 9
Fuzzy rule-base controller design for the controller [95]

Error   Delta error
        NL   NM   NS   Z    PS   PM   PL
NL      NL   NL   NL   NL   NM   NS   Z
NM      NL   NL   NL   NM   NS   Z    PS
NS      NL   NL   NM   NS   Z    PS   PM
Z       NL   NM   NS   Z    PS   PM   PL
PS      NM   NS   Z    PS   PM   PL   PL
PM      NS   Z    PS   PM   PL   PL   PL
PL      Z    PS   PM   PL   PL   PL   PL

In the second method, the control surface is utilized to design the desired compensators. For this
system two control surfaces were used, one for ‘coarse’ control and one for ‘fine’ control. The coarse
control was used to ‘slam’ the valves when large errors were present, regardless of the change of the
error. This action allowed the system to react quickly to hard accelerations and decelerations. The
fine controller was utilized to capture steady state and meet steady-state requirements.
The evaluations of the proposed controllers were performed with an existing PI controller. Both of the
design

methodologies were proven to be superior in comparison with the conventional controller currently utilized for the control of combustion pressure on jet engines.

Ignition timing control for a spark ignition engine
The optimum ignition timing, which gives the maximum brake torque for a given engine design, varies with the rate of flame development and propagation in the cylinder [96]. This depends, among other factors, on engine design and operating conditions, and on the properties of the air–fuel mixture. In modern engines, the ignition timing is generally controlled by fixed open-loop schedules as functions of engine speed, load and coolant temperature. It is desirable that this ignition timing can be adjusted to the optimum level producing the best torque to obtain minimum fuel consumption and maximum available power. Wang et al. [96] presented an ignition timing control system based on fuzzy logic theory. A pressure sensor system was developed for the determination of combustion parameters and ignition control on a Ford 1600 cm³ four-cylinder engine fuelled with natural gas. Several tests were carried out in optimizing the pressure detection system. The results obtained provide important information compatible with intelligent control of the engine using fuzzy logic technology. Moreover, tests carried out using this technology show good results that fit quite well with the original engine output torque characteristics.

Fault detection and isolation for an experimental internal combustion engine
Certain engine faults can be detected and isolated by examining the pattern of deviations of engine signals from their nominal unfailed values. Laukonen et al. [97] show how to construct a fuzzy identifier to estimate the engine signals necessary to calculate the deviation from nominal engine behavior, so as to determine if the engine has certain actuator and sensor calibration faults. The fuzzy identifier is compared to a non-linear autoregressive moving average with exogenous signal (ARMAX) technique, and experimental results are provided showing the effectiveness of the fuzzy identification based failure detection and identification strategy.

Sliding-mode control of gasoline fuel-injection system with oxygen sensor
The fuzzy ‘sliding mode’ control method is proposed by Nam et al. [98] for the design of a closed-loop fuel-injection system. A fuzzy logic control using variable structure system is combined with current oxygen sensors to maintain the stoichiometric air-to-fuel ratio. The fuzzy cell technique is also proposed to implement the designed controller on non-fuzzy processors. The proposed method offers good potential for the design of an engine controller since bias and large chattering errors due to the hysteresis of the oxygen sensor can be reduced by utilizing fuzzy logic control inside the boundary layer of the sliding mode. Additionally, robustness and stability are analytically guaranteed by the hybrid control structure. Simulation studies based on a typical engine-only model demonstrate the usefulness of the proposed method.

Internal combustion engines control
The characteristics of engine approximate load systems are considerably non-linear [99]. Therefore, when feedback control is required in such a case as a diesel engine governor or cruise control for a gasoline engine, a countermeasure such as gain scheduling must be incorporated to compensate for this non-linearity. For this purpose a fuzzy control, which is very robust against non-linearity, is developed. The fuzzy control was applied to the governor of a diesel engine, and the level of robustness of fuzzy control was investigated in comparison with PI and LQI control methods [99]. As a result, it was found that fuzzy control showed the highest level of robustness and a much faster control response compared with the two other methods of control.

7.5. Applications of neuro-fuzzy logic

Fuzzy logic controllers (FLC) with ANN rule adaptation have been used in applications in both combustion and IC engines. Table 10 lists several representative examples of the use of neuro-fuzzy logic in combustion and IC engines, which are analyzed in this paper.

Table 10
Summary of applications of neuro-fuzzy logic in combustion and internal combustion engines

Number  Authors        Reference  Year  Area        Subject
1       Ikonen et al.  [100]      2000  Combustion  Modeling of power plant flue-gas emissions
2       Li and Chang   [101]      1999  Combustion  Controller of a stoker-fired boiler
3       Wang et al.    [102]      2001  ICE         Ignition timing control for a natural gas fueled spark ignition engine
4       Du et al.      [103]      2001  ICE         Reconstruction of cylinder pressure from vibration signals
5       Yang et al.    [106]      2002  ICE         Optimum control of ignition timing and injection system in an in-cylinder injection type hydrogen fueled engine

7.5.1. Combustion

Modeling of power plant flue-gas emissions
Process modeling using fuzzy neural networks is presented by Ikonen et al. [100]. In a fluidised-bed combustor (FBC), the combustion chamber contains a quantity of finely divided particles such as sand or ash. The combustion air entering from below lifts these particles until they form a turbulent bed, which behaves like a boiling fluid. The fuel is added to the bed and the mixed material is kept in constant movement by the combustion air. The heat released as the material burns maintains the bed temperature, and the turbulence keeps the temperature uniform throughout the bed. The data set used in the experiments described in Ref. [100] was measured from a 25 MWt semi-circulated FBC district heating plant, using peat as fuel. Signals were measured during both open- and closed-loop process experiments with a frequency of 0.25 Hz. During the two days of experiments considered, the process was operated changing the values of primary/secondary air ratio, excess air ratio, and power level. NOx model input signals were selected based on a priori knowledge of the conditions affecting the formation and reduction of nitrogen oxides in FBC combustion. Put simply, the availability of oxygen in the various stages of the furnace increases the NOx formation, while the NOx reduction requires a long enough residence time to occur. The temperatures in the bed and freeboard also play an important role in the NOx formation and reduction.

In distributed logic processors (DLP) the rule base is parameterized. The DLP derivatives required by gradient-based training methods are given, and the recursive prediction error method is used to adjust the model parameters. DLP combine the idea of distributed modeling with logic processors. The goal of distributed modeling is to build a family of loosely coupled local and logic-oriented units instead of a single centralized analytical model. The input interface translates the data coming from the modeling environment into a format that fits the conceptual level of the intended model. The interface is formed via a collection of fuzzy sets and relevant matching mechanisms, producing modeling landmarks. Individual logic processors describe the logical interactions between modeling landmarks, and the interface converts the results back into numerical format. Fig. 29 shows a DLP configuration partitioning the output space into three classes. The degrees of membership in these classes are then transformed (defuzzified) into a real-valued output by taking a weighted average. Compared to more traditional fuzzy models, the main difference is that with DLPs the rule base is parameterized. The target of learning is to find out the weightings within the rule base, not simply to include or exclude rules or tune the modeling landmarks (the input–output fuzzy sets). Tuning of fuzzy sets is a very efficient way to reduce the prediction error of a fuzzy model, and this task is referred to as fundamental and crucial or dominant over the other possibilities of tuning a fuzzy inference system. The other possibilities include tuning of the inference mechanism or modifications in the rule base.

The power of the approach is illustrated with a modeling example where NO emission data from a full-scale FBC district heating plant are used. Several DLP models were tested using the FBC data. In all models, each of the input and output spaces was partitioned into three fuzzy sets (LOW, MEDIUM, HIGH) using triangular membership functions. The centers of the outermost sets were placed at the maximum and minimum values of the measurement data, the middle point at their mean value. The left and right slopes of the sets were given by the centers of the neighboring sets.

Fig. 29. Distributed logic processor [100].
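The three-set partition and the weighted-average defuzzification described for the DLP models can be illustrated with a minimal Python sketch. It is not the implementation of Ref. [100]: the function names are hypothetical, the "mean value" is assumed to be the arithmetic mean of the measurement data, the outermost sets are assumed to saturate outside the data range, and the logic processors themselves (the parameterized rule base) are not modeled.

```python
from statistics import mean


def make_partition(samples):
    """Centers of the LOW, MEDIUM and HIGH triangular sets.

    Outermost centers sit at the data minimum and maximum; the middle
    center is assumed here to be the arithmetic mean of the data.
    """
    return min(samples), mean(samples), max(samples)


def memberships(x, centers):
    """Degrees of membership of x in LOW, MEDIUM and HIGH.

    Each triangle falls to zero at the center of its neighbor, so the
    slopes are fixed by the neighboring centers. Requires low < mid < high.
    """
    low, mid, high = centers
    x = min(max(x, low), high)  # assumption: outer sets saturate at 1
    if x <= mid:
        medium = (x - low) / (mid - low)
        return {"LOW": 1.0 - medium, "MEDIUM": medium, "HIGH": 0.0}
    upper = (x - mid) / (high - mid)
    return {"LOW": 0.0, "MEDIUM": 1.0 - upper, "HIGH": upper}


def defuzzify(degrees, centers):
    """Weighted average of the set centers, weighted by membership degree."""
    low, mid, high = centers
    values = {"LOW": low, "MEDIUM": mid, "HIGH": high}
    total = sum(degrees.values())
    return sum(degrees[name] * values[name] for name in degrees) / total


if __name__ == "__main__":
    centers = make_partition([0.0, 2.0, 10.0])   # illustrative data only
    degrees = memberships(7.0, centers)
    print(centers, degrees, defuzzify(degrees, centers))
```

With this particular partition the membership degrees of a point always sum to one, so fuzzifying a value and defuzzifying it again returns the same value inside the data range. In a DLP the degrees passed to the defuzzifier would instead come from the trained logic processors.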



According to the authors, the method presented is general, and can be applied as well to other complex processes.

Controller for a stoker-fired boiler
A key issue in an industrial stoker-fired boiler is the design of an efficient and robust controller for its combustion system, so that the boiler can provide a continuous supply of steam at the desired pressure conditions [101]. However, it is difficult to achieve this objective by using a model-based approach because of the high non-linearity and uncertainty of boiler systems. In addition, the control performance may also suffer as a result of strong load changes, large disturbances, large time lags, and so on. A behavior-modeling-based approach for the design of a neuro-fuzzy controller for the combustion control of a stoker-fired boiler is presented by Li and Chang [101]. In this approach, boiler combustion processes with unknown structure are modeled by defining three dynamic behaviors. According to these behavior ‘templates’, the corresponding fuzzy-logic controllers can be optimized off-line. During boiler system operation, the appropriate fuzzy-logic controller is fired, based on an on-line assessment of its dynamic behavior. The application results obtained demonstrate the effectiveness and the robustness of the proposed controller.

7.5.2. Internal combustion engines

Ignition timing control for a natural gas fueled spark ignition engine
One of the important inputs to a spark ignition engine, which affects nearly all engine outputs, is ignition timing [102]. The optimum ignition timing which gives the maximum brake torque for a given engine design varies with the rate of flame development and propagation in the cylinder. In modern engines, ignition timing is generally controlled by fixed open-loop schedules as functions of engine speed and load. It is desirable that this ignition timing can be adjusted to the optimum level, which produces the best torque to obtain minimum fuel consumption and maximum available power. An ignition timing control system based on fuzzy logic and neural network theories is presented by Wang et al. [102]. A fiber optical sensor system was developed for measurement of the intensity of the luminous emission, which correlates with the combustion pressure and ignition timing control on a Ford 1600 cm³ four-cylinder spark ignition engine fuelled with natural gas. Several engine tests were carried out in optimizing the combustion intensity detection system. The results obtained provide important information compatible with intelligent control of the engine using fuzzy neural control technology. Moreover, tests carried out with data using this technology show good results that fit quite well with the original engine output torque characteristics.

Reconstruction of cylinder pressure from vibration signals
An approach to reconstruct IC engine cylinder pressure from the engine cylinder head vibration signals, using radial basis function (RBF) networks, is presented by Du et al. [103]. The relationship between the cylinder pressure and the engine cylinder head vibration signals is analyzed. An RBF network is applied to establish the non-parametric mapping model between the cylinder pressure time series and the engine cylinder head vibration signal frequency series. The fuzzy c-means clustering method and the gradient descent algorithm are used for selecting the centers and training the output layer weights of the RBF network, respectively. The validation of this approach to cylinder pressure reconstruction from vibration signals is demonstrated on a two-cylinder, four-stroke direct injection diesel engine, with data from a wide range of speed and load settings. The prediction capabilities of the trained RBF network model are validated against measured data.

Optimum control of ignition timing and injection system in an in-cylinder injection type hydrogen fueled engine
It is already known that the emission characteristics of hydrogen fueled engines are extremely good when running the engine under lean burn conditions, with excess air ratios (λ > 2), which lower the NOx emissions [104]. However, there is abnormal combustion in the engine, which is one of the factors that has prevented the practical use of the engine. It is also a common conclusion that abnormal combustion can be suppressed in the in-cylinder injection type engine [105]. But such advantages as suppression of abnormal combustion, engine power-up and reduction of NOx emission are gained depending on a proper injection system and reasonable injection timing, ignition timing and rate of hydrogen injection. In the study presented by Yang et al. [106] hydrogen is injected into the cylinder in the late compression stroke and is ignited by electric spark in a test engine. The research on the performance of the hydrogen-fueled engine is carried out under the condition of different ignition timing and injection timing. A solenoid drive type in-cylinder hydrogen injection system was designed. The injection system using electronic control was installed in a diesel engine modified for operation on hydrogen. As shown in Fig. 30, hydrogen fuel is supplied by a hydrogen gas bottle with output pressure 15 MPa, purity 98%. Hydrogen gas with pressure 10 MPa is driven into a high speed solenoid valve set. The amount of hydrogen injection and injection timing are decided by the solenoid valve controlled by the computer control system in accordance with the engine operation conditions. Ignition is also adjusted and controlled by the control system to ensure the optimum ignition. Hydrogen fuel with pressure 10 MPa is periodically injected into the combustion chamber near TDC by the injector, which was modified from the one used in a diesel engine.

The test engine was modified from a water cooled, 4-cylinder, 4-stroke diesel engine with 96 mm bore and 102 mm stroke. The combustion chamber was modified to

reduce the compression ratio from the original value of 18.5:1 to 8.5:1, and the test engine was equipped with a spark ignition system and a hydrogen injection system. The optimum relation between ignition timing and fuel injection timing for low-NOx emission and restraining the abnormal combustion in the engine was investigated [106]. Further, a control system consisting of a fuzzy–neural network controller combined with an ignition adaptive (IA) controller is applied to the engine in order to optimally control ignition timing, injection timing and cycle amount of hydrogen injection. Engine electronic control systems enable a high degree of performance optimization to be achieved through strategy and calibration details in software. The traditional open-loop control system, such as the look-up table type, has been widely applied in diesel engine and gasoline engine control systems. A fuzzy–neural network (FNN) controller combined with an IA controller is proposed in the article as an alternative. The two controllers comprise the control system, which can operate in open loop or closed loop manner. Inputs in the controlling system are speed signal, load signal, top dead center signal, cold system water temperature signal and air flow signal. Outputs are ignition timing, injection timing and the amount of hydrogen injection. Fig. 31 shows the flow diagram of the master program. The traditional map of look-up table type is replaced by a FNN controller in the control system. The FNN is an improved fuzzy–neural network system. It approximates the input–output mapping describing the system. It is known that the network has more precision and convergence, compared to other architectures. The network is a four-layer one that consists of input layer, membership function layer, fuzzy rules layer and defuzzified output.

Fig. 30. Block diagram of experimental set-up [106].

The control system developed by the authors combines the two controllers effectively through a programmed control by means of a threshold parameter, which accounts for the amount of variation, determined in advance. Particularly, under the condition of great varying rate of engine speed and some transient states, such as engine start, idle speed, acceleration and deceleration and so on, only the FNN controller operates. In stable condition of the engine, the FNN controller and the IA controller operate cooperatively, i.e. the FNN controller is a general regulator to controlled variation, combined with the IA controller, to carry out a fine adjustment. Thus, the performance of the hydrogen engine is optimized in every operating state of the engine.

Fig. 31. Flow diagram of the master program [106].

7.6. Other hybrid systems

Table 11 lists several representative examples of the use of hybrid systems in combustion, which are analyzed in this paper.

Table 11
Summary of applications of hybrid systems in combustion

Number  Authors         Reference  Year  System      Subject
1       Chang and Chen  [107]      2000  GA + Fuzzy  Controller design of incinerators
2       Chang and Chen  [108]      2000  ANN + GA    Prediction of emissions from incinerators
3       Zhou et al.     [109]      2001  ANN + GA    Optimization of low NOx pulverized coal combustion

7.6.1. Combination of genetic algorithms and fuzzy logic

Controller design for municipal incinerators
Mass burn waterwall incinerator has been the most widely used technology for processing the municipal solid waste (MSW) streams in many metropolitan regions since 1970 [107]. Not only the heterogeneity of physical composition and chemical property of the MSW but also the complexity of combustion mechanism would significantly influence

the performance of waste incineration. As a result, a successful operation requires an advanced combustion control system for handling many uncertain factors. Conventional automatic combustion control technology was found to be insufficient to handle such a highly non-linear system. The successful applications of those intelligent combustion control technologies, such as fuzzy control technology, for reducing the operational risk have received wide attention in recent years. The study illustrates the application potentials of the genetic fuzzy controller that is particularly designed for improving the traditional fuzzy control logic with the aid of GAs/genetic programming techniques [107]. Practical implementation of this methodology was assessed by a simulation analysis based on three representative types of mass burn waterwall incinerators in Taiwan. It is indicated by the authors that the quality of combustion control in those three municipal incinerators can possibly be enhanced via the use of genetic fuzzy control logic. According to the authors, the application of such a hybrid fuzzy control technology can be easily extended to many other types of industrial incinerators, such as modular, rotary kiln, and fluidized bed incinerators, by a slightly different approach.

7.6.2. Combination of artificial neural networks and genetic algorithms

Prediction of PCDDs and PCDFs emissions from municipal incinerators
The potential emissions of PCDDs/PCDFs from municipal incinerators have received wide attention in the last decade. Concerns were frequently addressed by the scientific community with regard to the aspects of health risk assessment, combustion criteria, and the public regulations. Without accurate prediction of PCDD/PCDF emissions, however, reasonable assessment of the health risk and essential appraisal of the combustion criteria or public regulations cannot be achieved. Previous prediction techniques for PCDD/PCDF emissions were limited by the linear models based on a least-square-based analytical framework, such that the inherent non-linear features cannot be explored via advanced system identification techniques. AI techniques were found useful in this study for the identification of non-linear structure in relation to the PCDD/PCDF emissions from municipal incinerators [108]. Examples were drawn from the emission test of PCDDs/PCDFs through the flue gas discharge from several municipal incinerators in both Europe and North America. Although the neural network model may exhibit better predictive results based on the performance indexes of percentage error and mean square error, model structure cannot be directly identified and expressed for illustrating the possible chemical mechanism with respect to the PCDD/PCDF emissions. However, the tree structured GAs, or so-called genetic programming, can rapidly screen out those applicable non-linear models as well as identify the optimal system parameters simultaneously in a highly complex system based on a small set of samples.

Optimization of low NOx pulverized coal combustion
A way of optimizing the low NOx combustion using the neural network and GAs for a pulverized coal burned utility boiler is presented by Zhou et al. [109]. The experiments have been carried out in a 600-MW capacity pulverized coal-fired, dry bottom boiler with a large furnace of 19.6 × 16.4 m² cross-sectional area and 57 m high. The tilting fuel and combustion air nozzles are located in the corners of the furnace; all nozzles can be tilted in the vertical direction over about 20° from the horizontal axis, both upwards and downwards, to adjust the reheated steam temperature for the varying fouling conditions of the furnace. The concentrated firing system is used to combust bituminite. The fuel and primary air streams are directed at the circumference of an imaginary circle of 1600 mm diameter at the center of the furnace. The lower secondary air streams are horizontally deviated a certain angle to the furnace walls. The upper secondary air and the over fire air (OFA) nozzles are directed to the opposite direction of the flow swirl in the furnace, to obtain the effect of slagging-prevention, and the low twisting residual at the exit of the furnace and also the low NOx emission.

A total of 12 tests have been carried out on this boiler, changing the boiler load, OFA distribution pattern, secondary air distribution pattern, coal quality and nozzles tilting angle, respectively, to analyze the characteristics of the NOx emission of the tangentially fired system.

The NOx emission characteristic of the 600 MW capacity boiler operated under different conditions is experimentally investigated and, on the basis of experimental results, the ANN is used to describe its NOx emission property to develop a neural network based model. Using the neural network to model the NOx emission characteristic of the coal fired boiler, the function between input operating

parameters and the NOx release can then be determined after the training process of the neural network. The neural network can give a good answer to an untrained input pattern. If the NOx emission function expressed by the neural network is taken as the fitness function, then the optimum operating parameters can be achieved under various conditions using the GA.

It is well known that the NOx emission characteristic of the tangentially fired boiler is affected by complicated factors such as coal quality, boiler load, fuel and air distribution pattern, oxygen concentration, boiler style, pulverized coal fineness, coal concentration of the primary air, burner style, etc. The design parameters of the boiler that was used have been determined. The adjustable parameters are the operating parameters and the coal quality.

An ANN having 29 input neurons, one output neuron and 31 hidden neurons is used to model the NOx emission characteristics of the boiler. For this 600 MW capacity boiler, the total fuel rate and the total air flow rate are employed as the inputs to evaluate the effect of boiler load. Five coal feeders are put into operation under rated load and a total of five feeder opening values are used to consider the effect of the fuel distribution pattern along the furnace height. There are six elevations of secondary air nozzles employed. All the secondary air burners’ dampers of the same elevations are at the same opening position. Six damper opening values are used in the neural network to consider the effect of the secondary air distribution pattern. Also two OFA damper opening values are employed as the network’s input. Four windbox pressure-measuring points are installed, whose average output is input to the neural network. At the rear of the economizer, there are four oxygen-measuring points, whose mean output value is used as an input. The carbon content, the hydrogen content, the oxygen content, the nitrogen content, the heat value and the volatile content are employed to evaluate the effect of coal quality on the NOx emission. Under rated load, five mills are put into operation. Five air flow rates of the mill are used as inputs to evaluate the effect of primary air distribution pattern along the height of the furnace. All the nozzles are tilted at the same angle, so the neural network has only one input for the tilting angle. A GA is employed to perform a search to determine the optimum solution of the neural network model, identifying appropriate setpoints for the current operating conditions, and the low NOx emission of the pulverized coal burned boiler is achieved. GAs use two sets of calculating parameters to achieve the optimum controllable parameters for lowest NOx emission. The first calculating process chooses a population size m = 50, a crossover probability pc = 0.8 and a mutation probability pm = 0.15; a total of 1500 generations are calculated. The second process uses a population size m = 50, a crossover probability pc = 0.25 and a mutation probability pm = 0.05; also a total of 1500 generations are calculated.

8. Conclusions

A number of AI techniques have been described in this paper. Logic programs called expert systems allow computers to ‘make decisions’ by interpreting data and selecting from among alternatives. ANNs are collections of small individually interconnected processing units. Information is passed between these units along interconnections. ANNs, while implemented on computers, are not programmed to perform specific tasks. Instead, they are trained with respect to data sets until they learn the patterns presented to them. Once they are trained, new patterns may be presented to them for prediction or classification. GAs are inspired by the way living organisms adapt to the harsh realities of life in a hostile world, i.e. by evolution and inheritance. The algorithm imitates the evolution of a population by selecting only fit individuals for reproduction. Therefore, a GA is an optimum search technique based on the concepts of natural selection and survival of the fittest. A GA utilizes three principal genetic operators: selection, crossover, and mutation. Fuzzy logic is used mainly in control engineering. It is based on fuzzy logic reasoning which employs linguistic rules in the form of IF–THEN statements. Fuzzy logic and fuzzy control feature a relative simplification of a control methodology description. This allows the application of a ‘human language’ to describe the problems and their fuzzy solutions. Hybrid systems combine more than one of the technologies described above, either as part of an integrated method of problem solution, or to perform a particular task that is followed by a second technique, which performs some other task.

From the description of the various applications presented in this paper, one can see that AI techniques have been applied in a wide range of fields for modeling, prediction and control in combustion processes. What is required for setting up such a system is data that represents the past history and performance of the real system and a selection of a suitable model. The selection of this model is done empirically and after testing various alternative solutions. The performance of the selected models is tested with the data of the past history of the real system.

Surely, the number of applications presented here is neither complete nor exhaustive but merely a sample of applications that demonstrate the usefulness and possible applications of AI techniques. Like all other approximation techniques, AI techniques have relative advantages and disadvantages. There are no rules as to when this particular technique is more or less suitable for an application. Based

on the work presented here it is believed that AI techniques offer an alternative method, which should not be underestimated.

References

[1] Barr A, Feigenbaum EA. The handbook of artificial intelligence, vol. 1. Los Altos, CA: Morgan Kaufmann; 1981.
[2] Rumelhart DE, Hinton GE, Williams RJ. Learning internal representations by error propagation. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1. Cambridge, MA: MIT Press; 1986 [chapter 8].
[3] Jackson P. Introduction to expert systems. Reading, MA: Addison Wesley Publishing Company; 1990.
[4] Pham DT. Expert systems in engineering. Berlin: IFS Publications/Springer; 1988.
[5] Nebendahi D. Expert systems: introduction to the technology and applications. New York: Wiley; 1987.
[6] Waterman DA. A guide to expert systems. Reading, MA: Addison-Wesley; 1982.
[7] Bonnet A, Haton JP, Truong-Ngoc JM. Expert systems: principles and practice. New York: Prentice Hall; 1988.
[8] Jackson P. Introduction to expert systems. Harlow, England: Addison-Wesley; 1999.
[9] Nannariello J, Frike FR. Introduction to neural network analysis and its applications to building services engineering. Building Serv Engng Res Technol 2001;22(1):
[20] Tsoukalas LH, Uhrig RE. Fuzzy and neural approaches in engineering. New York: Wiley; 1997.
[21] Ripley BD. Pattern recognition and neural networks. Cambridge, MA: Cambridge University Press; 1996.
[22] Ivakhenko AG. Polynomial theory of complex systems. IEEE Trans Syst, Man Cybern 1971;SMC-12:364–78.
[23] Ivakhenko AG. The group method of data handling: a rival of stochastic approximation. Sov Automat Control 1968;1:43–55. [in Russian].
[24] Farlow SJ, editor. Self-organizing methods in modeling. New York: Marcel Dekker; 1984.
[25] Hecht-Nielsen R. Neurocomputing. Reading, MA: Addison-Wesley; 1991.
[26] Zalzala A, Fleming P. Genetic algorithms in engineering systems. London, UK: The Institution of Electrical Engineers; 1997.
[27] Goldberg DE. Genetic algorithms in search optimization and machine learning. Reading, MA: Addison-Wesley; 1989.
[28] Davis L. Handbook of genetic algorithms. New York: Van Nostrand; 1991.
[29] Michalewicz Z. Genetic algorithms + data structures = evolution programs. 3rd ed. Berlin: Springer; 1996.
[30] Reznik L. Fuzzy controllers. Oxford: Newnes; 1997.
[31] Zadeh LA. Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans Syst Man Cybern 1973;3:28–44.
[32] Nie J, Linkens DA. Fuzzy-neural control: principles, algorithms and applications. Englewood Cliffs, NJ: Prentice-Hall; 1995.
[33] Mamdani EH. Application of fuzzy algorithms for control of
58–68. simple dynamic plant. IEE Proc 1974;121:1585–8.
[10] Haykin S. Neural networks: a comprehensive foundation. [34] Mamdani EH. Applications of fuzzy logic to approximate
New York: Macmillan; 1994. reasoning using linguistic synthesis. IEEE Trans Computers.
[11] Pogio T, Girosi F. Networks for approximation and learning. 1977;26(12):1182–91.
Proc IEEE 1990;78:1481–97. [35] Sugeno M. Industrial applications of fuzzy control. Amster-
[12] Neocleous C. A neural network architecture composed of dam: North-Holland; 1985.
adaptively defined dissimilar single-neurons. PhD Thesis. [36] Xu MH, Azevedo JLT, Carvalho MG. Modeling of a
Brunel University, UK; 1998. front wall fired utility boiler for different operating
[13] Werbos PJ. Beyond regression: new tools for prediction and conditions. Comput Meth Appl Mech Engng 2001;
analysis in the behavioral science. PhD Thesis. Harvard 190(28):3581–90.
University, Cambridge, MA; 1974. [37] Cho WS, Roh SD, Kim SW, Jang WH, Shon SS. The process
[14] Neuroshell-2 program manual. Ward Systems Group, Inc., modeling and simulations for the fault diagnosis of rotary
MD; 1996. kiln incineration process. J Ind Engng Chem 1998;4(2):
[15] Kalogirou S, Bojic M. Artificial neural networks for the 99 –104.
prediction of the energy consumption of a passive solar [38] Ranzi E, Faravelli T, Gaffuri P, Sogaro A. Low-temperature
building. Energy 2000;25(5):479–91. combustion: automatic-generation of primary oxidation
[16] Kalogirou S, Panteliou S, Dentsoras A. Modelling of solar reactions and lumping procedures. Combust Flame 1995;
domestic water heating systems using artificial neural 102(1/2):179– 92.
networks. Solar Energy 1999;65(6):335–42. [39] Basu P, Mitra S. Application of an expert system to the design
[17] Kalogirou S, Neocleous C, Schizas C. Artificial neural of a furnace of a circulating fluidized bed boiler. J Engng Gas
networks in modelling the heat-up response of a solar steam Turbines Power: Trans ASME 1994;116(3):462–7.
generation system. Proceedings of the Engineering Appli- [40] Kim SH, Wolfe WE, Hadipriono FC. The development of a
cations of Neural Networks (EANN’96) Conference, knowledge-based expert system for utilization of coal
London, UK; 1996. p. 1–4. combustion by-product in highway embankment. Civil
[18] Kalogirou S. Artificial neural networks for estimating the Engng Syst 1992;9(1):41–57.
local concentration ratio of parabolic trough collectors. Proc [41] Kalogirou S. Applications of artificial neural networks in
EuroSun’96 Conf, Freiburg, Germany 1996;1:470–5. energy systems: a review. Energy Convers Mgmt 1999;
[19] Kalogirou S, Neocleous C, Schizas C. Artificial neural 40(10):1073–87.
networks used for estimation of building heating load. [42] Kalogirou S. Artificial neural networks in renewable energy
Proc CLIMA Int Conf, Brussels, Belgium, Paper systems: a review. Renewable Sustainable Energy Rev 2001;
Number P159;1997:159. 5(4):373–401.
S.A. Kalogirou / Progress in Energy and Combustion Science 29 (2003) 515–566 565

[43] Tronci S, Baratti R, Servida A. Monitoring pollutant [61] Leib TM, Mills PL, Lerou JJ. Fast response distributed
emissions in a 4.8 MW power plant through neural network. parameter fluidized bed reactor model for propylene partial
Neurocomputing 2002;43:3–15. oxidation using feed-forward neural network methods. Chem
[44] Schmidt D, Schmidt D. Online prediction of the free lime Engng Sci 1996;51(10):2189–98.
content in the sintering zone and the use of neural networks [62] Debeda H, Rebiere D, Pistre J, Menil F. Thick-film pellistor
for process optimization. ZKG Int 2001;54(9):471 –9. array with a neural-network post treatment. Sens Actuators,
[45] Chong AZS, Wilcox SJ, Ward J. Prediction of gaseous B: Chem 1995;27(1–3):297–300.
emissions from a chain grate stoker boiler using neural [63] Allen MG, Butler CT, Johnson SA, Lo EY, Russo F. An
networks of ARX structure. IEE Proc Sci Measure Technol imaging neural-network combustion control-system for
2001;148(3):95–102. utility boiler applications. Combust Flame 1993;94
[46] Blonbou R, Laverdant A, Zaleski S, Kuentzmann P. Active (1/2):205–14.
control of combustion instabilities on a Rijke tube using [64] Muller B, Keller H. Neural networks for combustion process
neural networks. Proc Combust Inst 2000;28:747–55. modelling. Proceedings of the International Conference,
[47] Chong AZS, Wilcox SJ, Ward J. Application of neural- EANN’96, London, UK; 1996. p. 87–90.
network-based controller on an industrial chain grate stoker [65] Christo FC, Masri AR, Nebot EM. Utilising artificial neural
boiler. J Inst Energy 2000;73(497):208–14. network and repro-modelling in turbulent combustion.
[48] Chong AZS, Wilcox SJ, Ward J. The development of a neural Proceedings of the IEEE International Conference
network based system for the optimal control of chain-grate ICNN’95, Perth, Western Australia; 1995. p. 911 –6.
stoker-fired boilers. Proc ASME Heat Transf Div 2000; [66] Hafner M, Schuler M, Nelles O, Isermann R. Fast neural
363(3):103– 9. networks for diesel engine control design. Control Engng
[49] Ward J, Wilcox SJ, Payne R. Prediction of the thermal Pract 2000;8(11):1211–21.
performance of a high-temperature furnace using neural [67] Heister F, Froehlich M. Non-linear time series analysis of
networks. IMechE 1999;C565/053:437–42. combustion pressure data for neural network training with the
[50] Ferretti G, Piroddi L. Estimation of NOx emissions in concept of mutual information. Proc Inst Mech Engnr, Part
thermal power plants using artificial neural networks.
D: J Automobile Engng 2001;215(D2):299 –304.
J Engng Gas Turbines Power: Trans ASME 2001;123(2):
[68] Isermann R, Hafner M. Mechatronic combustion engines:
465–71.
from modeling to optimal control. Eur J Control 2001;7(2/3):
[51] Larachi F. Neural network kinetic prediction of coke burn-off
220 –47.
on spent MnO2/CeO2 wet oxidation catalysis. Appl Catal B:
[69] Park S, Yoon P, Sunwoo M. Feedback error learning neural
Environ 2001;30(1/2):141– 50.
networks for spark advance control using cylinder pressure.
[52] Blonbou R, Laverdant A, Zaleski S, Kuentzmann P. Active
Proc Inst Mech Engnr, Part D: J Automobile Engng 2001;
adaptive combustion control using neural networks. Combust
215(D5):625 –36.
Sci Technol 2000;156:25–47.
[70] Li XQ, Yurkovich S. Neural network based, discrete adaptive
[53] Blasco JA, Fueyo N, Dopazo C, Chen JY. A self-
sliding mode control for idle speed regulation in IC engines.
organizing-map approach to chemistry representation in
J Dyn Syst Measure Control: Trans ASME 2000;122(2):
combustion applications. Combust Theory Modeling 2000;
269 –75.
4(1):61–76.
[54] Blasco JA, Fueyo N, Larroya JC, Dopazo C, Chen YJ. A [71] Sharkey AJC, Chandroth GO, Sharkey NE. A multi-net
single-step time-integrator of a methane – air chemical system for the fault diagnosis of a diesel engine. Neural
system using artificial neural networks. Comput Chem Comput Appl 2000;9(2):152–60.
Engng 1999;23(9):1127–33. [72] Thompson GJ, Atkinson CM, Clark NN, Long TW,
[55] Zhu Q, Jones JM, Williams A, Thomas KM. The predictions Hanzevack E. Neural network modeling of the emissions
of coal/char combustion rate using an artificial neural and performance of a heavy-duty diesel engine. Proc Inst
network approach. Fuel 1999;78(14):1755–62. Mech Engnr, Part D: J Automobile Engng 2001;214(D2):
[56] Liu GP, Daley S. Output-model-based predictive control of 111 –26.
unstable combustion systems using neural networks. Control [73] Ortmann S, Glesner M. Development and implementation of
Engng Pract 1999;7(5):591–600. a neural knock detector using constructive learning methods.
[57] Zbicinski I, Smucerowicz I, Stumillo C, Kasznia J, Stawczyk Int J Uncertainty Fuzziness Knowledge-Based Syst 1998;
J, Murlikiewicz K. Optimisation and neural modeling of 6(2):127–37.
pulse combustors for drying applications. Drying Technol [74] Muller R, Hemberger HH, Baier K. Engine control using
1999;17(3):609–33. neural networks: a new method in engine management
[58] Blasco JA, Fueyo N, Dopazo C, Ballester J. Modeling the systems. Mechanica 1997;32(5):423 –30.
temporal evolution of a reduced combustion chemical system [75] Sharkey AJC, Sharkey NE, Chandroth GO. Diverse neural-
with an artificial neural network. Combust Flame 1998; net solutions to a fault diagnosis problem. Neural Comput
113(1/2):38–52. Appl 1996;4(4):218–27.
[59] Christo FC, Masri AR, Nebot EM. Artificial neural network [76] DeNicolao G, Scattolini R, Siviero C. Modeling the
implementation of chemistry with pdf simulation of H2/CO2 volumetric efficiency of IC engines: parametric, non-
flames. Combust Flame 1996;106(4):406–27. parametric and neural techniques. Control Engng Pract
[60] Altug S, Tulunay E. Neural-network-based process controller 1996;4(10):1405 –15.
design and on-line application to fluidized bed combustion. [77] Stevens SP, Shayler PJ, Ma TH. Experimental-data proces-
J Syst Engng 1996;6(3):148–58. sing techniques to map the performance of a spark-ignition
566 S.A. Kalogirou / Progress in Energy and Combustion Science 29 (2003) 515–566

engine. Proc Inst Mech Engnr, Part D: J Automobile Engng [93] Owens M, Segal C. Development of a hybrid-fuzzy air
1995;209(4):297–306. temperature controller for a supersonic combustion test
[78] Morita S. Optimization control for combustion parameters of facility. Exp Fluids 2001;31(1):26–33.
petrol engines using neural networks: in the case of online [94] Trebi-Ollennu A, Dolan JM, Khosla PK. Adaptive fuzzy
control. Int J Vehicle Des 1993;14(5/6):552–62. throttle control for an all-terrain vehicle. Proc Inst Mech
[79] Korb R, Jorgl HP, Lutz B. Nonlinear dynamic modeling of a Engnr, Part I: J Syst Control Engng 2001;215(I3):189–98.
gas engine using an RBF-network. Math Comput Modeling [95] Zilouchian A, Juliano M, Healy T, Davis J. Design of a fuzzy
Dyn Syst 1999;5(2):133–51. logic controller for a jet engine fuel system. Control Engng
[80] Homma R, Chen JY. Combustion process optimization by Pract 2000;8(8):873–83.
genetic algorithms: reduction of NO2 emission via [96] Wang W, Chirwa EC, Zhou E, Holmes K, Nwagboso C.
optimal postflame process. Proc Combust Inst 2000;28: Fuzzy ignition timing control for a spark ignition engine.
2483–9. Proc Inst Mech Engnr, Part D: J Automobile Engng 2000;
[81] Harris SD, Elliott L, Ingham DB, Pourkashanian M, Wilson 214(D3):297 –306.
CW. The optimization of reaction rate parameters for [97] Laukonen EG, Passino KM, Krishnaswami V, Luh GC,
chemical kinetic modeling of combustion using genetic Rizzoni G. Fault-detection and isolation for an experimental
algorithms. Comput Meth Appl Mech Engng 2000;190(8 – internal combustion engine via fuzzy identification. IEEE
10):1065–90. Trans Control Syst Technol 1995;3(3):347 –55.
[82] Chakravarthy SSS, Vohra AK, Gill BS. Predictive emission [98] Nam SK, Kim JS, Yoo WS. Fuzzy sliding mode control of
monitors (PEMS) for NOx generation in process heaters. gasoline fuel-injection system with oxygen sensor. JSME Int
Comput Chem Engng 2000;23(11/12):1649– 59. J Ser, C: Dyn Control Robot Des Manufact 1994;37(1):
[83] Fogarty TC, Bull L. Optimizing individual control rules and 100 –6.
multiple communicating rule-based control-systems with [99] Matsumoto H, Morita S, Takiyama T. Application of fuzzy
parallel distributed genetic algorithms. IEE Proc: Control control to internal combustion engines. JSME Int J Ser B,
Theory Appl 1995;142(3):211 –5. Fluids Thermal Engng 1994;37(1):159 –64.
[84] Polifke W, Geng WQ, Dobbeling K. Optimization of rate [100] Ikonen E, Najim K, Kortela U. Neuro-fuzzy modeling of
coefficients for simplified reaction mechanisms with genetic power plant flue-gas emissions. Engng Appl Artif Intell
algorithms. Combust Flame 1998;113(1/2):119– 34. 2000;13(6):705 –17.
[85] Glielmo L, Santini S. A two-time-scale infinite-adsorption [101] Li W, Chang XG. A neuro-fuzzy controller for a stoker-fired
model of three way catalytic converters during the warm-up boiler, based on behavior modeling. Control Engng Pract
phase. J Dyn Syst Measure Control: Trans ASME 2001; 1999;7(4):469– 81.
123(1):62–70. [102] Wang W, Chirwa EC, Zhou E, Holmes K, Nwagboso C.
[86] Samaras P, Kungolos A, Karakasidis T, Georgiou D, Perakis Fuzzy neural ignition timing control for a natural gas fueled
K. Statistical evaluation of PCDD/F emission data during spark ignition engine. Proc Inst Mech Engnr, Part D: J
solid waste combustion by fuzzy clustering techniques. Automobile Engng 2001;215(D12):1311–23.
J Environ Sci Health, Part A: Tox/Hazard Subst Environ [103] Du H, Zhang L, Shi X. Reconstructing cylinder pressure from
Engng 2001;36(2):153–61. vibration signals based on radial basis function networks.
[87] Li W, Chang XG. Application of hybrid fuzzy logic Proc Inst Mech Engnr, Part D: J Automobile Engng 2001;
proportional plus conventional integral-derivative controller 215(D6):761 –7.
to combustion control of stoker-fired boilers. Fuzzy Sets Syst [104] Jorach R, Enderle C, Decker R. Development of a low NOx
2000;111(2):267 –84. truck hydrogen engine with high specific power output. Int J
[88] Kiriakidis K, Tzes A, Grivas A, Peng PY. Modeling, plant Hydrogen Energy 1997;22(4):423 –7.
uncertainties, and fuzzy logic sliding control of gaseous [105] Dudko DA, Martynova TI. Self-structuring of multicompo-
systems. IEEE Trans Control Syst Technol 1999;7(1): nent materials in hydrogen-thermal treatment. Int J Hydrogen
42–55. Energy 1997;22(2/3):323–8.
[89] Fujii S, Tomiyama S, Nogami T, Shirai M, Ase H, [106] Yang ZZ, Wei JQ, Fang ZY, Li JD. An investigation of
Yokoyama T. Fuzzy combustion control for reducing both optimum control of ignition timing and injection system in an
CO and NOx from flue gas of refuse incineration furnace. in-cylinder injection type hydrogen fueled engine. Int J
JSME Int J Ser C: Mech Syst Mach Elem Manufact 1997; Hydrogen Energy 2002;27(2):213 –7.
40(2):279–84. [107] Chang NB, Chen WC. Fuzzy controller design for municipal
[90] Chang NB, Wang SF. Managerial fuzzy optimal planning for incinerators with the aid of genetic algorithms and genetic
solid-waste management systems. J Environ Engng, ASCE programming techniques. Waste Mgmt Res 2000;18(5):
1996;122(7):649–58. 429 –43.
[91] Shiono S, Usui T, Iriyama M, Suzuki K, Naka Y. Stable [108] Chang NB, Chen WC. Prediction of PCDDs/PCDFs emis-
combustion in sludge melting furnace by fuzzy-logic control. sions from municipal incinerators by genetic programming
Water Sci Technol 1993;28(11/12):347–54. and neural network modeling. Waste Mgmt Res 2000;18(4):
[92] Suzuki K, Iriyama M, Nakamori Y. Statistical analysis of 341 –51.
dynamics of a rotary kiln sewage sludge incinerator using [109] Zhou H, Cen KF, Mao JB. Combining neural network and
fuzzy modeling. Kagaku Kogaku Ronbunshu 1992;18(6): genetic algorithms to optimize low NOx pulverized coal
894–903. combustion. Fuel 2001;80(15):2163– 9.