
SCHOOL of ELECTRICAL ENGINEERING & COMPUTER SCIENCE

FACULTY of ENGINEERING & BUILT ENVIRONMENT


The UNIVERSITY of NEWCASTLE

Comp3330/6380 Machine Intelligence

Course Coordinator: A/Prof. Stephan Chalup

Semester 1, 2014

LECTURE 5
Introduction to
Evolutionary Computation

April 8, 2014

OVERVIEW

Aims of this Lecture
Introduction
    Literature and Software
    Basic Features of Evolutionary Systems
    Representation of Organisms
Areas of Evolutionary Computation
The Human Genome
Genetic Algorithms (GA)
    Genetic Operators
    How a Simple GA Works
Modes of Learning in Natural Evolution
EC + ANN
    EC + ANN Systems
Evolutionary Programming (EP)
    Three Steps of Basic EP
    Differences between GAs and EP
Evolution Strategies
    Simulated Annealing
    Differences between EP and ES
Genetic Programming (GP)
Applications of EAs
Multi-objective Optimisation
EC is an Interdisciplinary Research Field
Correspondences
Three Timescales Model
Summary
    Critical Discussion

Aims of this Lecture


1. Know the basic operators of a genetic algorithm (GA) and how
a basic GA works.
2. Learn about different areas of Evolutionary Computation (EC).
3. Compare biological with algorithmic aspects.
4. Discuss different modes and timescales of learning in evolution.
5. Learn about simulated annealing.
6. Hear about some applications of EC.
7. Understand the interdisciplinary character of EC.

© 2013 Chalup

Introduction
1. Computer simulations in the 1960s made it possible to analyse systems
that were too complex for mathematical modelling.
2. Evolutionary biologists became interested in models of natural
evolutionary systems.
3. Computer scientists and engineers used them for optimisation
(Rechenberg, 1973; Schwefel, 1995).
4. Artificial life: Design and experiment with artificial worlds.

Literature and Software


1. Books: (De Jong, 2006), (Fogel et al., 1966), (Fogel, 2000),
(Goldberg, 1989), (Holland, 1975), (Mitchell, 1996),
(Rechenberg, 1973), (Schwefel, 1995)
2. Papers: (Chalup and Maire, 1999), (Kirkpatrick et al., 1983),
(Kohl and Stone, 2004), (Yao, 1999)
3. Software: Matlab Global Optimization Toolbox :
Genetic Algorithm Examples
http://www.mathworks.com.au/help/toolbox/gads/f6691.html


Basic Features of Evolutionary Systems


Inspired by Darwinian evolutionary systems, we assume that an
artificial evolutionary system should consist of
1. One or more populations of individuals competing for limited
resources.
2. Dynamically changing populations due to the birth and death of
individuals.
3. A concept of fitness which reflects the ability of an individual to
survive and reproduce.
4. A concept of variational inheritance: offspring closely resemble
their parents, but are not identical.
This leads to processes that, starting from some initial conditions,
follow a trajectory over time through a complex evolutionary state
space. One can then study convergence properties, sensitivity to
initial conditions, etc.

Representation of Organisms
How can the individuals of a population, i.e. the organisms, be
represented?
One possibility is to represent them as a fixed length vector of
features that are chosen because of their relevance for fitness
evaluation, e.g.:
< hair colour, eye colour, skin colour, height, weight >
Loosely, this vector can be regarded as either:
the genotype of an individual, specified by a chromosome of
five genes, or
the observable physical traits of an individual, i.e. its phenotype.
By specifying the range of values (alleles) the 5 features in the
example might take on, a 5-dimensional space of all possible
genotypes (or phenotypes) in this artificial world is defined.
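As a toy illustration, the fixed-length representation above can be written down directly. This is a hedged sketch: the gene names and the allele ranges below are illustrative assumptions, not values from the lecture.

```python
import random

# Hypothetical allele ranges for the five features in the example above.
GENE_RANGES = {
    "hair_colour": ["black", "brown", "blond", "red"],
    "eye_colour": ["brown", "blue", "green"],
    "skin_colour": ["light", "medium", "dark"],
    "height_cm": list(range(140, 211)),
    "weight_kg": list(range(40, 151)),
}

def random_genotype(rng=random):
    """Sample one point from the 5-dimensional space of possible genotypes."""
    return {gene: rng.choice(alleles) for gene, alleles in GENE_RANGES.items()}
```

Every genotype this function can return is one point in the 5-dimensional space spanned by the allele ranges.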
© 2006 (De Jong, 2006)

Areas of Evolutionary Computation


1. Genetic Algorithms (GA) (Holland, 1975; Goldberg, 1989)
2. Evolutionary Programming (EP) (Fogel et al., 1966; Fogel,
2000)
3. Evolution Strategies (ES) (Rechenberg, 1973; Schwefel, 1995)
4. Genetic Programming (GP)
The algorithms of all four areas fall under the umbrella of
Evolutionary Algorithms (EA).


The Human Genome


The most precious collection of information
3 billion pieces of data in the form of deoxyribonucleic acid
(DNA)
The individual pieces of information are called nucleotides or
bases.
At least 99.9% of the genome is identical between human
individuals.
0.1% ≈ 3 million differences.
Humans have 46 chromosomes arranged in 23 pairs.
DNA is coiled into chromosomes.


Genetic Algorithms (GA)


GAs are based on abstractions of the mechanisms of evolution
in nature.
GAs create a population of individuals represented by
chromosomes which go through a process of simulated
evolution.
Arrays (fixed-length) of bits or characters represent the
chromosomes.
Bit manipulation operations allow the implementation of
crossover, mutation and other operations.
Possible application: multidimensional optimisation where the
chromosome encodes the different parameters being optimized.
One cycle of a GA is referred to as a generation.
Elitist option: The winner never dies.

Genetic Operators
Mutation
Recombination or crossover
Selection


Mutation

00101011000111010011
        ↓  (one randomly chosen bit is flipped)
00101011010111010011
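The bit flip above can be sketched as a small Python helper. The per-bit mutation rate is an assumed parameter for illustration, not a value from the slides.

```python
import random

def mutate(chromosome: str, rate: float, rng=random) -> str:
    """Flip each bit of a bit-string chromosome independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome
    )
```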

Crossover

parents:
00101011|000111010011
11001001|101011111101

offspring (tails exchanged at the crossover point):
00101011|101011111101
11001001|000111010011
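The exchange above can be written as a one-line helper; the function name and the explicit crossover-point argument are illustrative choices.

```python
def one_point_crossover(p1: str, p2: str, point: int):
    """Swap the tails of two parent bit strings after position `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]
```

Applied to the two parents on the slide with the crossover point after bit 8, this reproduces the two offspring shown.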

Two Point Crossover

parents:
00101011|00011101|0011
11001001|10101111|1101

offspring (middle segments exchanged):
00101011101011110011
11001001000111011101
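Two-point crossover exchanges the middle segment; again the function name and explicit cut-point arguments are illustrative.

```python
def two_point_crossover(p1: str, p2: str, a: int, b: int):
    """Exchange the segment between positions `a` and `b` of two parent bit strings."""
    return (p1[:a] + p2[a:b] + p1[b:],
            p2[:a] + p1[a:b] + p2[b:])
```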

How a Simple GA Works


1. t ← 0; initialise a (random) population of individuals P(t);
   evaluate the fitness of all initial individuals of P(t)
2. WHILE NOT termination criterion (t, fitness, etc.) DO
   (a) Select a sub-population P'(t) for offspring production
   (b) Recombination: crossover chromosomes in P'(t)
   (c) Mutation: mutate chromosomes in P'(t)
   (d) Fitness evaluation of P'(t)
   (e) Selection: select survivors according to their fitness from
       P(t) and P'(t)
   (f) t ← t + 1; P(t) ← survivors
   END WHILE
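The loop above can be sketched in Python. This is a minimal sketch, not a tuned implementation: tournament selection, the parameter defaults and the OneMax fitness (count of 1-bits) are illustrative assumptions.

```python
import random

def run_ga(fitness, n_bits=20, pop_size=30, generations=100,
           p_crossover=0.9, p_mutation=0.02, rng=None):
    """Minimal generational GA following steps 1-2(a-f) above; fitness is maximised."""
    rng = rng or random.Random(0)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(pop, key=fitness)        # elitist option: the winner never dies
        offspring = [elite[:]]
        while len(offspring) < pop_size:
            # (a) select parents by 3-way tournament
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # (b) recombination: one-point crossover
            if rng.random() < p_crossover:
                cut = rng.randrange(1, n_bits)
                p1 = p1[:cut] + p2[cut:]
            # (c) mutation: independent bit flips
            child = [b ^ 1 if rng.random() < p_mutation else b for b in p1]
            offspring.append(child)
        pop = offspring                      # (d)-(f): next generation
    return max(pop, key=fitness)

best = run_ga(sum)   # OneMax: fitness = number of 1-bits
```

With elitism the best fitness in the population never decreases from one generation to the next.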

Evolution of a Population

[Diagram (© 1994 Atmar): the genotypic (coding) state space contains
genotype1 and genotype2; mutation maps genotype2 to genotype2'.
Epigenesis maps genotypes to phenotype1 and phenotype2 in the
phenotypic (behavioural) space, where selection acts; representation
maps back into the genotypic space.]

Modes of Learning in Natural Evolution


Three modes of learning (Atmar, 1994):
Phylogenetic Learning: adaptive behaviours are accrued within
the lifetime of a phyletic lineage. The reservoir that accumulates
phylogenetically learned behaviour is the species aggregate
germline; the least unit of change in this reservoir is a base pair.
Sociogenetic Learning: adaptive behaviours are accumulated
within the lifetime of a group. The reservoir of learned behaviour
is social culture; the least unit of change is a shared experience.
Ontogenetic Learning: appropriate behaviours are learned
through trial and error during the lifetime of an individual. The
reservoir of learned behaviour is aggregate neuronal and
hormonal memory; the least unit of change is neurotransmitter
titer and/or receptor site sensitivity.

EC + ANN
evolution of connection weights (an alternative to standard
network training)
evolution of architectures (topology and activation functions)
evolution of learning rules (adaptation of learning parameters)
The evolution can take place on all three levels simultaneously, e.g.
the evolution of the learning rule can interact with the evolution of
the architecture.

EC + ANN Systems
EPNet (Yao and Liu, 1997)
ENZO (Braun and Zagorski, 1994)
NeuroEvolution of Augmenting Topologies (NEAT) (Stanley
and Miikkulainen, 2002)


Evolutionary Programming (EP)


Places emphasis on the behavioral linkage between parents and
their offspring.
For EP, like GAs, there is an underlying assumption that a
fitness landscape can be characterized in terms of variables, and
that there is an optimum solution (or multiple such optima).
Example: Find the shortest path in a Traveling Salesperson
Problem, i.e. each solution would be a path:
fitness = length of the path
fitness landscape = hypersurface proportional to the path
lengths in a space of possible paths
goal = the globally shortest path in that space or, more
practically, to find very short tours very quickly

Three Steps of Basic EP


The basic EP method involves 3 steps (Repeat until a threshold for
iteration is exceeded or an adequate solution is obtained):
1. Choose an initial population of trial solutions at random.
2. Each solution is replicated into a new population. Each of these
offspring solutions is mutated according to a distribution of
mutation types.
3. Each offspring solution is assessed by computing its fitness.
Typically, a stochastic tournament is held to determine N
solutions to be retained for the population of solutions. There is
no requirement that the population size be held constant
(several offspring per parent possible).
EP typically does not use crossover.

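The three steps can be sketched in Python for minimising a real-valued function. This is a hedged sketch: the population sizes, the fixed mutation scale, the 10-opponent stochastic tournament and the test function used below are all illustrative assumptions.

```python
import random

def basic_ep(f, dim=2, pop_size=20, generations=200, sigma=0.3, rng=None):
    """Basic EP sketch: random init, Gaussian mutation (no crossover),
    stochastic tournament selection over parents and offspring."""
    rng = rng or random.Random(1)
    # Step 1: random initial trial solutions
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: each solution is replicated and mutated
        offspring = [[x + rng.gauss(0, sigma) for x in ind] for ind in pop]
        # Step 3: stochastic tournament; keep the pop_size best-scoring solutions
        union = pop + offspring
        wins = [(sum(f(ind) <= f(rng.choice(union)) for _ in range(10)), ind)
                for ind in union]
        wins.sort(key=lambda pair: -pair[0])
        pop = [ind for _, ind in wins[:pop_size]]
    return min(pop, key=f)
```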

Differences between GAs and EP


The typical GA approach involves encoding the solutions as a
string of representative tokens, the genome. In EP, there is no
constraint on the representation. It follows from the problem.
In EP mutation changes aspects of the solution according to a
normal distribution (i.e. it weights minor variations in the
behavior of the offspring as highly probable and substantial
variations as increasingly unlikely). The severity of mutations is
often reduced as the global optimum is approached.


Evolution Strategies
ES were developed in experiments which used a wind tunnel to
find the optimal shapes of bodies in a flow.
First a (1+1) ES with Gaussian distributed mutation was used.
(μ+λ) ES: μ = the population size, λ = the number of offspring
generated per generation; incorporates recombination; the
mutation scheme and the stepsize control are taken across
unchanged from the (1+1) ES. The parental generation is taken
into account during selection.
(μ,λ) ES: only the offspring undergo selection, and the parents
die off.

By choosing a certain ratio μ/λ, one can determine the
convergence property of the evolution strategy: if one wants
fast but local convergence, one should choose a small, hard
selection ratio, e.g. 5/100; when looking for the global optimum,
one should favour a softer selection, e.g. 20/100.
Population size: the population has to be sufficiently large.
Genetic variety is necessary to prevent a species from becoming
genetically poorer and poorer and eventually dying out.
ES proved to be successful when compared on a large number
of test problems.
ES are adaptable to nearly all sorts of optimization problems,
because they need very little information about the problem,
especially no derivatives of the objective function.

/* Evolutionary Hill Climbing (EHC) = (1+1)-ES */

Evaluate network with initial weights W_champ ⇒ error_champ
WHILE ((error_goal ≤ error_champ) AND (counter ≤ nEpochs))
    stepsize ← 0.01 · Gaussian noise
    FOR all weights of the neural net
        ΔW_mutant ← stepsize · Gaussian noise
        W_mutant ← W_champ + ΔW_mutant
    END FOR
    Evaluate network given by W_mutant ⇒ error_mutant
    IF (error_mutant < error_champ)
        W_champ ← W_mutant
        error_champ ← error_mutant
    END IF
    counter ← counter + 1
END WHILE
RETURN error_champ
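A Python rendering of the pseudocode above, under stated assumptions: `evaluate` stands in for the network error function, the initial weights are drawn from a standard Gaussian, and the absolute value is taken so the stepsize is a magnitude; these choices are illustrative.

```python
import random

def ehc(evaluate, n_weights, n_epochs=500, error_goal=1e-3, rng=None):
    """(1+1)-ES / evolutionary hill climbing over a weight vector.
    `evaluate` maps a weight vector to an error value."""
    rng = rng or random.Random(0)
    w_champ = [rng.gauss(0, 1) for _ in range(n_weights)]
    error_champ = evaluate(w_champ)
    counter = 0
    while error_goal <= error_champ and counter <= n_epochs:
        stepsize = 0.01 * abs(rng.gauss(0, 1))
        w_mutant = [w + stepsize * rng.gauss(0, 1) for w in w_champ]
        error_mutant = evaluate(w_mutant)
        if error_mutant < error_champ:   # keep the mutant only if it improves
            w_champ, error_champ = w_mutant, error_mutant
        counter += 1
    return w_champ, error_champ
```

Because the champion is replaced only on improvement, the returned error is never worse than the initial one.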

Simulated Annealing
Simulated Annealing can be seen as a (1+1)ES with
time-dependent selection pressure but constant mutation rate
(Kirkpatrick et al., 1983).
The term annealing is used in metallurgy where metals or
glass are first heated and then gradually cooled to achieve a
hardening effect.
Idea: Bounce the solution just hard enough to escape local
minima while not losing momentum towards the global
minimum.
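A minimal sketch of the idea: accept a worse candidate with probability exp(-Δ/T) and lower the temperature T each step. The proposal distribution, cooling schedule and all parameter values below are illustrative assumptions.

```python
import math
import random

def simulated_annealing(f, x0, t0=1.0, cooling=0.995, steps=2000, rng=None):
    """(1+1)-style search minimising f, with temperature-dependent acceptance."""
    rng = rng or random.Random(0)
    x, fx, t = list(x0), f(x0), t0
    best, f_best = list(x0), fx
    for _ in range(steps):
        cand = [xi + rng.gauss(0, 0.5) for xi in x]
        fc = f(cand)
        if fc < fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc            # move, possibly uphill while T is high
            if fx < f_best:
                best, f_best = x, fx
        t *= cooling                    # gradual cooling: selection hardens over time
    return best, f_best
```

Early on, high T lets the search "bounce" out of local minima; as T decays the acceptance rule becomes nearly greedy.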

Differences between EP and ES


EP and ES share many similarities. When implemented to solve
real-valued function optimization problems, both typically
operate on the real values themselves (rather than any coding
of the real values as is often done in GAs).
The main differences between ES and EP are:
1. Selection: EP typically uses stochastic selection via a
tournament. In contrast, ES typically uses deterministic
selection.
2. Recombination: EP is an abstraction of evolution at the level
of reproductive populations (i.e., species) and thus no
recombination mechanisms are typically used. In contrast, ES
is an abstraction of evolution at the level of individual
behavior and many forms of recombination have been
implemented within ES.

Genetic Programming (GP)


GP is the extension of the genetic model of learning into the
space of programs.
Objects that constitute the population are programs that, when
executed, are the candidate solutions to the problem.
These programs are expressed in genetic programming as parse
trees, rather than as lines of code.
Crossover is implemented by taking randomly selected subtrees
in the individuals (selected according to fitness) and exchanging
them.
GP usually does not use mutation.


Applications of EAs
EAs can compute any computable function, i.e. everything a
normal digital computer can do.
EAs should be used when there is no other known problem
solving strategy.
ALife: Attempts to simulate the kind of behaviour exhibited by
real, living creatures, e.g.
Framsticks is a three-dimensional ALife project. The physical
structure of creatures and their control systems are evolved.
This system uses the standard EA framework to evolve 3D
agents equipped with neural networks.
Creatures


Biocomputing: Protein folding, RNA folding, sequence alignment.
See server of the European Bioinformatics Institute:
http://www.ebi.ac.uk/ebi_home.html ENCORE
Cellular Programming: Systems involving the actions of simple,
locally-interacting components that give rise to coordinated
global behavior.
Game Playing: Evolution of a population of players.


Timetabling: of classes or exams in Universities, etc.


First developed in an Italian high school. The software
package is now in current use in some high schools in
Milano (Colorni et al., 1990).
Fitness function for a genome representing a timetable
involves computing degrees of punishment for various
problems with the timetable, such as clashes, instances of
students having to take consecutive exams, instances of
students having (e.g.) three or more exams in one day, the
degree to which heavily-subscribed exams occur late in the
timetable (which makes marking harder), overall length of
timetable, etc.
The modular nature of the fitness function is the key to the
strength of using GAs:
All constraints can be handled as weighted components of
the fitness function
Easy to adapt to a wide range of objectives

ANN training: See the project on dynamic language processing
with recurrent NNs from the last lecture.
Robotic walk optimization.

Multi-objective Optimisation
Standard EAs optimise a single objective function.
Sometimes the simultaneous optimisation of more than one
objective function is required.
Easy solution: Combine all objective functions into one by
using, e.g. a linear combination.
Alternative: Use vector of fitness values and trade off one
objective against others.
Pareto optimal front: the set of points that are not dominated
by other points; along it, an increase in the fitness of one
objective results in a decrease in the fitness associated with one
or more other objectives.
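The dominance relation behind the Pareto front can be stated in a few lines; the helper names are illustrative, and maximisation of every objective is assumed.

```python
def dominates(a, b):
    """True if `a` Pareto-dominates `b`: at least as good in every
    objective and strictly better in at least one (maximisation)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """The non-dominated points of the given set: its Pareto optimal front."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```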

EC is an Interdisciplinary Research Field


Computer scientists want to find out about the properties of
sub-symbolic information processing with EAs and about
learning, i.e. adaptive systems in general.
They also build the hardware necessary to apply future EAs
(precursors are already beginning to emerge) to huge real-world
problems; the term massively parallel computation
[HILLIS92] springs to mind.
Engineers of many kinds want to exploit the capabilities of EAs
in many areas to solve their applications, especially optimisation
problems.
Roboticists want to build MOBOTs (MOBile ROBOTs, i.e.
R2D2s) that navigate through uncertain environments, without
using built-in maps.
Cognitive scientists might view some EAs as a possible
apparatus to describe models of thinking and cognitive systems.

Physicists use EC hardware, e.g. Hillis's (Thinking Machines
Corp.) Connection Machine, to model real-world problems
which include thousands of variables, that run naturally in
parallel and thus can be modelled more easily and especially
faster on a parallel machine than on a serial one.
Biologists are finding EAs useful when it comes to protein
folding and other bio-computational problems.
EAs can also be used to model the behaviour of real
populations of organisms.
Chemists, and in particular biochemists and molecular chemists,
are interested in problems such as the conformational analysis
of molecular clusters and related problems in molecular sciences.
Philosophers and some other really curious people may also be
interested in EC for various reasons.

Correspondences

chromosome
    biological meaning: string of DNA
    use in EC: candidate solution, often encoded as a bit string
    mathematical object: point in vector space
gene
    biological meaning: functional block of DNA which encodes a particular protein
    use in EC: short block of adjacent bits that encodes a particular element of the candidate solution
    mathematical object: point in subspace
locus
    biological meaning: position of a gene on a (pair of) chromosome(s)
    mathematical object: set of points in vector space
allele
    biological meaning: possible type of gene related to a specific locus
    use in EC: value of a gene
    mathematical object: coordinate of a point
genome
    biological meaning: all genes present in one complete haploid set of chromosomes of a species
    use in EC: complete set of genes (and hence chromosomes)
    mathematical object: vector
genotype
    biological meaning: particular set of genes of an individual, i.e. the building plan of the organism
    use in EC: configuration of bits in the chromosomes of a particular individual
    mathematical object: point in vector space
phenotype
    biological meaning: result of interaction of a genotype with the environment
    use in EC: the expressed traits of an individual
    mathematical object: mapping
selection
    biological meaning: the way individuals are chosen for reproduction (e.g. survival of the fittest)
    use in EC: genotypes in the population are selected for reproduction according to their fitness
recombination (crossover)
    biological meaning: genes are exchanged between each pair of chromosomes to form a gamete
    use in EC: exchanging genes between two single-chromosome haploid parents
mutation
    biological meaning: single nucleotides are changed from parents to offspring
    use in EC: random change of the allele
    mathematical object: mapping
epistasis
    biological meaning: interaction between genes in the expression of the genotype, e.g. one gene can turn other genes on and off (masking effect)
    use in EC: special genes are used as parameters controlling other genes
    mathematical object: mapping between vector spaces (non-bijective)

Three Timescales Model

Timephase I
    Neurobiology: evolution of the brain, co-evolution of brain areas, co-evolution of language and the brain
    Machine Learning: evolutionary algorithms generate and connect functional modules, incremental learning
Timephase IIa
    Neurobiology: prenatal neural development
    Machine Learning: genetically programmed growing algorithms
Timephase IIb
    Neurobiology: critical phases (postnatal), early learning, child language acquisition
    Machine Learning: pruning and growing algorithms, Hebbian learning and formation of cell assemblies, structure incremental learning
Timephase III
    Neurobiology: adaptive learning processes which continue after the critical phases
    Machine Learning: ANN learning algorithms, reinforcement learning, data incremental learning

Summary
EC = GA + EP + ES + GP
Mutation, Recombination, Selection
Optimisation tasks
Many applications
Modular fitness function
The essential element of natural evolution: selection?
Combinations of ANN and EAs.


A Difficult Function for Optimisation

[3-D surface plot of z for x, y ∈ [−4, 4], showing many local minima.]
z = 20 + 2x^2 + 2y^2 - 10(cos(2πx) + cos(2πy))


>> x = -4:0.1:4;
>> y = -4:0.1:4;
>> [X,Y] = meshgrid(x,y);
>> z = 20 + 2*X.^2 + 2*Y.^2 - 10*(cos(pi*2*X)+cos(pi*2*Y));
>> surf(x,y,z);
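For reference, the same surface can be evaluated in Python. It is a scaled Rastrigin-type function with global minimum 0 at the origin, surrounded by many local minima; the function name below is an illustrative choice.

```python
import math

def z(x, y):
    """Python equivalent of the MATLAB surface above."""
    return (20 + 2 * x**2 + 2 * y**2
            - 10 * (math.cos(2 * math.pi * x) + math.cos(2 * math.pi * y)))
```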


Critical Discussion
How close can we get to biologically plausible artificial
intelligence?
Is the current Turing machine based concept of computing
appropriate for biologically realistic real-time information
processing?


LITERATURE
Atmar, W. (1994). Notes on the simulation of evolution. IEEE
Transactions on Neural Networks, 5:130–147.
Braun, H. and Zagorski, P. (1994). ENZO-M - a hybrid approach for
optimizing neural networks by evolution and learning. In Proceedings
of the Third Int. Conference on Parallel Problem Solving from
Nature, pages 440–451. Springer-Verlag.
Chalup, S. and Maire, F. (1999). A study on hill climbing for neural
network training. In Proceedings of the 1999 Congress on Evolutionary Computation (CEC99), July 6-9, 1999, Mayflower Hotel,
Washington D.C., USA, volume 3, pages 2014–2021.
De Jong, K. A. (2006). Evolutionary Computation. A Unified Approach. The MIT Press, Cambridge, MA, USA.
Fogel, D. B. (2000). Evolutionary Computation: Towards a New
Philosophy of Machine Intelligence. IEEE Press, New York, 2nd
edition.
Fogel, L. J., Owens, A. J., and Walsh, M. J. (1966). Artificial
Intelligence through Simulated Evolution. Wiley, New York.
Goldberg, D. (1989). Genetic Algorithms in search, optimization
and machine learning. Addison-Wesley.
Holland, J. (1975). Adaptation in Natural and Artificial Systems. MIT
Press.
Kirkpatrick, S., Gelatt, C., and Vecchi, M. (1983). Optimization by
simulated annealing. Science, 222:671–680.
Kohl, N. and Stone, P. (2004). Machine learning for fast
quadrupedal locomotion. In The Nineteenth National Conference
on Artificial Intelligence, pages 611–616.


Mitchell, M. (1996). An Introduction to Genetic Algorithms. MIT


Press.
Rechenberg, I. (1973). Evolutionsstrategie: Optimierung technischer
Systeme nach Prinzipien der biologischen Evolution. Frommann-
Holzboog Verlag, Stuttgart.
Russell, S. J. and Norvig, P. (2003). Artificial Intelligence: A Modern
Approach. Pearson Education, Inc., Upper Saddle River, NJ, second
edition.
Schwefel, H.-P. (1995). Evolution and Optimum Seeking. John
Wiley, Chichester, UK.
Stanley, K. O. and Miikkulainen, R. (2002). Evolving neural networks
through augmenting topologies. Evolutionary Computation,
10(2):99–127.
Yao, X. (1999). Evolving artificial neural networks. Proceedings of
the IEEE, 87(9):1423–1447.

Yao, X. and Liu, Y. (1997). A new evolutionary system for evolving
artificial neural networks. IEEE Transactions on Neural Networks,
8(3):694–713.
