Figure 1: The workflow for the global model.

In more detail, the slave nodes in the cluster perform only the fitness evaluation operator. The mappers receive the records in the form (individual, destination). The destination field is used only by the other models, so it will be mentioned later. We deliberately disabled the reduce phase, because there is no need to move individuals between nodes. After the map phase, the master reads back the individuals and continues with the remaining genetic operators, considering the whole current population.

Figure 3: The workflow for the island model.

In Hadoop, the numbers of mappers and reducers are not strictly correlated, but we coupled them in order to represent them as islands. We used the mappers to execute the generation periods; at the end of the map phase, a function applies the migration criterion by which each individual is assigned a specific destination island: this is where the second part of the output records comes into play. The partitioner can then establish where to send the individuals, and each reducer is used only to write into the HDFS the individuals received for its corresponding island.
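To make the routing step concrete, the following is a minimal sketch of such a partitioner, assuming the destination island index is carried in the map output key; the class below is illustrative and is not the actual elephant56 code:

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Illustrative sketch: each reducer plays the role of one island,
    // so an individual is routed to the reducer whose index matches
    // its destination island.
    public class IslandPartitioner<V> extends Partitioner<IntWritable, V> {

        @Override
        public int getPartition(IntWritable destinationIsland, V individualWrapper,
                int numberOfIslands) {
            return destinationIsland.get() % numberOfIslands;
        }
    }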
Figure 2: The workflow for the grid model.

The Driver has the task of randomly generating a sequence of neighbourhood destinations for the individuals in the current population. These destinations are stored into the records as the destination fields introduced above.
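A possible sketch of this assignment step is shown below; the class and method names are illustrative assumptions, not the elephant56 implementation:

    import java.util.Random;

    // Illustrative sketch: the Driver draws a random destination
    // neighbourhood for every individual of the current population.
    public class NeighbourhoodAssignment {

        private final Random random = new Random();

        public int[] randomDestinations(int populationSize, int numberOfNeighbourhoods) {
            int[] destinations = new int[populationSize];
            for (int i = 0; i < populationSize; i++)
                destinations[i] = random.nextInt(numberOfNeighbourhoods);
            return destinations;
        }
    }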
We implement the example using the Java programming language and postpone some details to Section 6.

The OneMax problem can be formally described as finding a string $\vec{x} = \{x_1, x_2, \ldots, x_N\}$, with $x_i \in \{0, 1\}$, that maximises the following equation:

$$F(\vec{x}) = \sum_{i=1}^{N} x_i \qquad (1)$$
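For instance, for $N = 5$ the string $\vec{x} = 10110$ has fitness $F(\vec{x}) = 3$, while the optimum is the all-ones string, with fitness $N$.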
To solve this problem with GAs, the individuals can be represented as bit strings, and the above equation can be used as the fitness function. We also need to define the genetic operators. elephant56 allows the developer to define each of these elements by extending the classes of the framework. Because there is an underlying distributed platform (i.e., Hadoop MapReduce), many of the objects are encapsulated into wrapper objects which ease the serialisation process.

In elephant56, an individual can be defined by extending the class Individual, which requires at least the implementation of the following serialisation methods:

    public abstract class Individual implements Cloneable {
        public abstract Object clone() throws CloneNotSupportedException;
        public abstract int hashCode();
    }

For the OneMax example, a bit string chromosome can be defined by adapting a list of boolean elements:

    public class BitStringIndividual extends Individual {

        private List<Boolean> bits;

        public BitStringIndividual(int size) {
            // pre-fill the list so that set(index, value) is valid
            // on a freshly created individual
            bits = new ArrayList<>(Collections.nCopies(size, false));
        }

        public void set(int index, boolean value) {
            bits.set(index, value);
        }

        public boolean get(int index) {
            return bits.get(index);
        }

        public int size() {
            return bits.size();
        }

        @Override
        public Object clone() throws CloneNotSupportedException {
            BitStringIndividual copy = (BitStringIndividual) super.clone();
            copy.bits = new ArrayList<>(bits);
            return copy;
        }

        @Override
        public int hashCode() {
            return bits.hashCode();
        }
    }

The second important element to define is the fitness value, namely an object quantifying the result of the fitness function evaluation. This is possible by extending the FitnessValue class, which is also required to be comparable:

    public abstract class FitnessValue implements Comparable<FitnessValue>, Cloneable {
        public abstract int compareTo(FitnessValue other);
        public abstract Object clone() throws CloneNotSupportedException;
        public abstract int hashCode();
    }

For the example, the fitness value is an integer, since we need to store for each solution how many bits are set to 1:

    public class IntegerFitnessValue extends FitnessValue {

        private int number;

        public IntegerFitnessValue(int value) {
            number = value;
        }

        public int get() {
            return number;
        }

        @Override
        public int compareTo(FitnessValue other) {
            // a non-null value always compares greater than null
            if (other == null)
                return 1;
            int otherNumber = ((IntegerFitnessValue) other).get();
            return Integer.compare(number, otherNumber);
        }

        @Override
        public Object clone() throws CloneNotSupportedException {
            return super.clone();
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(number);
        }
    }

Both the Individual and FitnessValue objects are encapsulated into a wrapper class called IndividualWrapper.
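Since this section does not show the wrapper's internals, the following is only a minimal sketch of the idea, assuming a simple pair-like structure; the actual elephant56 class also implements the serialisation machinery mentioned above:

    // Minimal illustrative sketch (not the actual elephant56 source):
    // a wrapper pairing an individual with its fitness value.
    public class IndividualWrapper<I extends Individual, F extends FitnessValue> {

        private I individual;
        private F fitnessValue;

        public IndividualWrapper(I individual) {
            this.individual = individual;
        }

        public I getIndividual() {
            return individual;
        }

        public F getFitnessValue() {
            return fitnessValue;
        }

        public void setFitnessValue(F fitnessValue) {
            this.fitnessValue = fitnessValue;
        }
    }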
Next, we need to implement the fitness function by extending the FitnessEvaluation class:

    public abstract class FitnessEvaluation<IndividualType extends Individual,
            FitnessValueType extends FitnessValue>
            extends GeneticOperator<IndividualType, FitnessValueType> {

        public abstract FitnessValueType evaluate(
                IndividualWrapper<IndividualType, FitnessValueType> wrapper);
    }

For OneMax, the fitness function (Equation 1) consists of simply counting the number of bits set to 1 in the bit string:

    public class OneMaxFitnessEvaluation extends
            FitnessEvaluation<BitStringIndividual, IntegerFitnessValue> {

        @Override
        public IntegerFitnessValue evaluate(
                IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) {
            BitStringIndividual individual = wrapper.getIndividual();

            int count = 0;
            for (int i = 0; i < individual.size(); i++)
                if (individual.get(i))
                    count++;

            return new IntegerFitnessValue(count);
        }
    }

At this point, we need to define the genetic operators (i.e., crossover, mutation, selection). Given that standard operators are already provided by the framework, in the following we describe in detail only the definition of those genetic operators that require a specific implementation in order to manage the classes we have defined so far. The procedure to define a genetic operator is similar to the one used for the fitness function, so we omit the code of the extended superclasses.

Before applying the genetic operators, we need to create an initial population. This can be done by randomly creating the individuals as follows:

    public class RandomBitStringInitialization extends
            Initialization<BitStringIndividual, IntegerFitnessValue> {

        private int individualSize;
        private Random random;

        public RandomBitStringInitialization(..., Properties userProperties, ...) {
            individualSize = userProperties.getInt(INDIVIDUAL_SIZE_PROPERTY);
            random = new Random();
        }

        @Override
        public IndividualWrapper<BitStringIndividual, IntegerFitnessValue>
                generateNextIndividual(int id) {
            BitStringIndividual individual = new BitStringIndividual(individualSize);

            // each bit is set to 0 or 1 with equal probability
            for (int i = 0; i < individualSize; i++)
                individual.set(i, random.nextInt(2) == 1);

            return new IndividualWrapper<>(individual);
        }
    }
It is worth noting that some properties can be distributed through the Properties object, which is filled on the master node and made available to the constructors of the genetic operators when they are executed in parallel. In the example, it has been used to read the size of the bit strings.
After the fitness evaluation has taken place, elitism and parents selection follow. We chose to define them by using the BestIndividualsElitism and RouletteWheelParentsSelection classes, already provided by the framework. Of course, the developer may choose to define other elitism and/or parents selection strategies by extending the classes Elitism and ParentsSelection, respectively.
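As a purely hypothetical illustration of this extension point (the method name and signature below are our assumptions, not the documented elephant56 API), a uniform random parents selection might look as follows:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    // Hypothetical sketch: picks parents uniformly at random. The
    // superclass name comes from the text, but the overridden method
    // is an assumption made only for illustration.
    public class RandomParentsSelection extends
            ParentsSelection<BitStringIndividual, IntegerFitnessValue> {

        private final Random random = new Random();

        public List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> selectParents(
                List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> population,
                int numberOfParents) {
            List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> parents =
                    new ArrayList<>(numberOfParents);
            for (int i = 0; i < numberOfParents; i++)
                parents.add(population.get(random.nextInt(population.size())));
            return parents;
        }
    }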
The crossover operator needs a specific implementation to manage bit string splits. A single point crossover can be defined as follows:

    public class BitStringSinglePointCrossover extends
            Crossover<BitStringIndividual, IntegerFitnessValue> {

        private Random random;

        public BitStringSinglePointCrossover(...) {
            random = new Random();
        }

        @Override
        public List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> cross(
                IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper1,
                IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper2, ...) {
            BitStringIndividual parent1 = wrapper1.getIndividual();
            BitStringIndividual parent2 = wrapper2.getIndividual();

            int cutPoint = random.nextInt(parent1.size());

            BitStringIndividual child1 = new BitStringIndividual(parent1.size());
            BitStringIndividual child2 = new BitStringIndividual(parent1.size());

            // copy the prefix of each parent into the corresponding child
            for (int i = 0; i < cutPoint; i++) {
                child1.set(i, parent1.get(i));
                child2.set(i, parent2.get(i));
            }

            // swap the suffixes after the cut point
            for (int i = cutPoint; i < parent1.size(); i++) {
                child1.set(i, parent2.get(i));
                child2.set(i, parent1.get(i));
            }

            List<IndividualWrapper<BitStringIndividual, IntegerFitnessValue>> children =
                    new ArrayList<>(2);
            children.add(new IndividualWrapper<>(child1));
            children.add(new IndividualWrapper<>(child2));

            return children;
        }
    }

The function cross selects a random cut point and builds two new children by mixing the chromosomes of the parents. Thereafter, a random mutation function is implemented:

    public class BitStringMutation extends
            Mutation<BitStringIndividual, IntegerFitnessValue> {

        private double mutationProbability;
        private Random random;

        public BitStringMutation(...) {
            // userProperties is among the elided constructor parameters
            mutationProbability = userProperties.getDouble(MUTATION_PROBABILITY_PROPERTY);
            random = new Random();
        }

        @Override
        public IndividualWrapper<BitStringIndividual, IntegerFitnessValue> mutate(
                IndividualWrapper<BitStringIndividual, IntegerFitnessValue> wrapper) {
            BitStringIndividual individual = wrapper.getIndividual();

            // flip each gene independently with the configured probability
            for (int i = 0; i < individual.size(); i++)
                if (random.nextDouble() <= mutationProbability)
                    individual.set(i, !individual.get(i));

            return wrapper;
        }
    }

This mutation operator, as defined above, mutates each gene according to a mutation probability that is distributed as a property value.

The survival selection is the last operator to be applied to produce the next offspring. In our example, we used the roulette wheel selection already implemented by the RouletteWheelSurvivalSelection class provided in the framework. Of course, the developer may choose to define his/her own survival selection strategy by extending the class SurvivalSelection.

Finally, we need to register with the Driver all the classes we defined, as follows:

    public class App {
        public static void main(String[] args) {
            // creation of the driver and userProperties objects elided

            driver.setIndividualClass(BitStringIndividual.class);
            driver.setFitnessValueClass(IntegerFitnessValue.class);

            driver.setInitializationClass(RandomBitStringInitialization.class);
            driver.setInitializationPopulationSize(POPULATION_SIZE);
            userProperties.setInt(INDIVIDUAL_SIZE_PROPERTY, INDIVIDUAL_SIZE);

            driver.setElitismClass(BestIndividualsElitism.class);
            driver.activateElitism(true);
            userProperties.setInt(NUMBER_OF_ELITISTS_PROPERTY, NUMBER_OF_ELITISTS);

            driver.setParentsSelectionClass(RouletteWheelParentsSelection.class);

            driver.setCrossoverClass(BitStringSinglePointCrossover.class);

            driver.setSurvivalSelectionClass(RouletteWheelSurvivalSelection.class);
            driver.activateSurvivalSelection(true);

            driver.setUserProperties(userProperties);

            driver.run();
        }
    }

The selection between the sequential and the parallel models is possible by specifying one of the Driver class specialisations. Assuming that a Hadoop MapReduce cluster is already set up, we can pack the code into a single JAR file, including the elephant56 dependency, and execute it with the standard Hadoop launch method.
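For instance, assuming the application is packed into a JAR named onemax.jar (an illustrative name), it can be submitted to the cluster with the usual hadoop jar onemax.jar App invocation.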
6. ARCHITECTURE
In this section we describe the architecture of elephant56. We conceptually divided it into two levels of abstraction:

1. the core level, which manages the communication with the underlying Hadoop MapReduce platform;

2. the user level, which allows the developer to interface with the framework.

Whereas the core level is immutable for the user, it is through the user level that developers can extend its functionalities and develop their own GAs.

6.1 Core Level
The core level is responsible for dealing with the Hadoop MapReduce platform. This is done by extending the MapReduce classes provided with the Hadoop library. Figure 4 shows the structure of the core package, which contains five subpackages, each responsible for a different functionality.

Figure 4: The structure of the core package (subpackages: common, input, output, generator, reporter).

6.1.2 core.input
The core.input package allows reading the serialised populations from the HDFS and sending each split to a specific node. As mentioned in Section 3, the serialisation is made by using Avro, thus each split corresponds to a single binary file stored into the HDFS. The information about the splits is stored using the PopulationInputSplit class and read with the PopulationRecordReader class, which deserialises the Avro objects into Java objects for the slave nodes.

6.1.3 core.output
The core.output package allows serialising the Java objects produced during the generations and storing them into the HDFS. This is done by using the classes NodesOutputFormat and PopulationRecordWriter.

6.1.4 core.generator
The core.generator package (Figure 6) is composed of the GenerationsPeriodExecutor class, which implements the SGA described in Section 4.1. The specialisations of this class allow the distribution of GAs according to the different parallel models (see Section 4).

Figure 6: The core.generator package: GenerationsPeriodExecutor and its specialisations Global, GridPrimary, GridSecondary and Island.
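To picture the specialisation pattern only, the following is a minimal sketch under assumed hook names (not the actual elephant56 source): the abstract executor fixes the skeleton of a generations period, while subclasses such as Global, GridPrimary, GridSecondary and Island customise how populations enter and leave a node:

    import java.util.List;

    // Illustrative sketch of the specialisation pattern: the hook
    // method names below are hypothetical, chosen for illustration.
    public abstract class GenerationsPeriodExecutor<I extends Individual, F extends FitnessValue> {

        public void run() {
            List<IndividualWrapper<I, F>> population = readPopulation();
            population = executeGenerations(population);
            writePopulation(population);
        }

        protected abstract List<IndividualWrapper<I, F>> readPopulation();

        protected abstract List<IndividualWrapper<I, F>> executeGenerations(
                List<IndividualWrapper<I, F>> population);

        protected abstract void writePopulation(
                List<IndividualWrapper<I, F>> population);
    }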