You are on page 1of 35

Le Havre University

Faculty of Sciences and Technology


Laboratory of Computer Sciences




Practical training report
from the 8
th
of March 2004 to the 10
th
of June 2004




DEDIS
Dynamic distribution of an ecosystem model











Date: 09.06.2004
Version: 0.6 (16:00)
Author: Robert FISCH
Tutor: Cyrille BERTELLE
Frdric GUINAND
Damien OLIVIER


PructicuI Truininq Report

Page 2 of 35

0. PREFACE

0.1. Acknowledgments
I want to thank all members of the laboratory who I worked with, particularly:

M. Cyrille BERTELLE, headmaster of the DEA-ITA and one of my tutors all along this
time.

M. Damien OLIVIER, my tutor and mentor, who helped me with every problem or
trouble I encountered.

M. Frdric GUINAND, who encouraged me to continue my studies at the University
of Le Havre.

M. Antoine DUTOT, PhD student and author of AntCO
2
, with whom I worked with
during the entire project and who was always ready to tackle new problems.

M. Guillaume PREVOST and M. Sylvain LEREBOURG, PhD students, who always
answered my questions about the ProActive API in a constructive way.

Mme Emna BOUAZIZI, M. Majed ABDOULI, M. Jrme HAUBERT, M. Denis
MERON, M. Samy SAMGHOUNI, M. Pierrick TRANOUEZ and everybody else who
was very patient with me.


Special thanks go also to:

My classmates, Mme. Mlanie DERRE, Mme. Noemy PICARD, M. Mahmoud
ABIDER, M. Mokrane BOUARABA, M. Thomas DE CONTES, M. Jean-Claude DE
SOUZA, M. Frederic DUCHAUSSOY, M. Mathieu GALLET, M. Jean Baptiste
GASHUMBA, M. Anis HAJ SAID, M. Nizar IDOUDI, M. Mohamedou OULD BIBI, M.
Yoann PIGNE, M. Gauthier PITOIS, M. Mathieu PRIGENT, M. Frank SANNIER

Romy FISCH, my sister, and Chantal WEIS, my wife, who helped to revise this document.



PructicuI Truininq Report

Page 3 of 35

0.2. Abstracts
0.2.1. Abstract
Entity-based simulations are basically very synchronous and need a lot of
calculation power. This document describes an entity model for large-scale
distributable simulations. It tries to break down the synchronism in order to
make it possible to process different tasks independently on distinct nodes of the
network. Additionally, a related architecture will be presented. With the growing
number of entities, problems of overcharged CPU's or heavily loaded network
connections appear. In o rder to set up some kind of load-balancing between the
distinct nodes, an intelligent ant-based algorithm will be used.


0.2.2. Rsum
Les simulations centr individus sont souvent trs synchrone et ont besoin d'une
puissance de calcul formidable. Ce document dcrit un modle d'entits pour de
grandes simulations distribues qui essaye de casser le dit synchronisme en vue
de pouvoir faire tourner indpendamment diffrentes tches sur des noeuds
distinct du rseau. Une architecture adquate sera prsente. Lorsque le nombre
d'entits augmente, des problmes de surcharges des CPUs ou d'embouteillages
de connexion rseau peuvent surgir. En vue d'une quilibration des charges
entre les noeuds, un algorithme de fourmis intelligent sera utilis.


0.2.3. Kurzfassung
Individuum basierte Simulationen sind oft sehr synchron und bentigen sehr viel
Rechenkraft. Dieses Dokument beschreibt ein individuum-basiertes Modell fr
groe verteile Simulationen. Dieses versucht hinsichtlich einer unabhngigen
Ausfhrung verschiedener Prozesse auf den einzelnen Knoten des Netzwerkes
diesen Synchronismus zu brechen. In diesem Zusammenhang wird auch eine
dementsprechende Architektur prsentiert. Mit steigender Anzahl an simulierten
Einheiten, knnen Rechnerberlastungen oder Transmissionsengpsse von
Netzwerkverbindungen auftreten. Diesbezglich wird ein intelligenter
Ameisenalgorithmus eingesetzt um einen Lastenausgleich zwischen den
einzelnen Knoten zu schaffen.



PructicuI Truininq Report

Page 4 of 35


TobIe of confenfs

0. PREFACE.....................................................................................2
0.1. Acknowledgments.............................................................................2
0.2. Abstracts .........................................................................................3
0.2.1. Abstract .............................................................................................. 3
0.2.2. Rsum............................................................................................... 3
0.2.3. Kurzfassung......................................................................................... 3
1. INTRODUCTION .........................................................................7
1.1. LIH .................................................................................................7
1.1.1. Presentation ........................................................................................ 7
1.1.2. Members ............................................................................................. 7
1.1.2.1. Professors ................................................................................................. 7
1.1.2.2. Lecturers................................................................................................... 7
1.1.2.3. PhD Students............................................................................................. 8
1.1.3. Location .............................................................................................. 8
1.2. Related projects ...............................................................................8
2. DESCRIPTION ............................................................................9
2.1. Context ...........................................................................................9
2.2. Objective ....................................................................................... 10
2.3. Requirements................................................................................. 10
3. SIMULATION MODEL ................................................................11
3.1. Environment .................................................................................. 11
3.2. Entities.......................................................................................... 12
3.3. Simulation architecture.................................................................... 12
3.3.1. The master ........................................................................................ 13
3.3.2. The slave........................................................................................... 13
3.4. Environment distribution.................................................................. 14
3.4.1. Static distribution ............................................................................... 14
3.4.2. Dynamic distribution........................................................................... 15
3.4.3. No distribution ................................................................................... 15
3.5. Simulation model ............................................................................ 15
3.5.1. Entity transactions.............................................................................. 16
3.5.2. Entity migration ................................................................................. 16
3.6. Interactions ................................................................................... 17
3.6.1. Push ................................................................................................. 17
3.6.2. Pull ................................................................................................... 18
3.6.2.1. Cell by cell............................................................................................... 18
3.6.2.2. By rectangular zone.................................................................................. 18

PructicuI Truininq Report

Page 5 of 35

3.6.2.3. By circular zone with arc ........................................................................... 19
3.7. Behaviour ...................................................................................... 20
4. INTERFACE TO AntCO
2
..............................................................21
4.1. Interaction..................................................................................... 21
4.2. Communication .............................................................................. 21
4.2.1. General concept ................................................................................. 21
4.2.2. Scenario............................................................................................ 23
4.2.3. Mutual blocking problem...................................................................... 24
4.2. Interface........................................................................................ 25
5. SIMULATION RESULTS .............................................................26
5.1. Scalability...................................................................................... 26
5.2. Entities.......................................................................................... 28
5.2.1. Grouping ........................................................................................... 28
5.2.2. Ungrouping........................................................................................ 28
5.2.3. Obstacles .......................................................................................... 29
5.2.4. Inactive entities ................................................................................. 30
5.3. Communication graphs.................................................................... 30
6. CONCLUSION............................................................................33
6.1. The project .................................................................................... 33
6.2. General conclusion.......................................................................... 33
7. REFERENCES & ACRONYMS ......................................................34
7.1. Acronyms ...................................................................................... 34
7.2. Links ............................................................................................. 34
7.3. References..................................................................................... 35


PructicuI Truininq Report

Page 6 of 35


TobIe of fiqures

Figure 1: Space division.......................................................................................11
Figure 2: Environment "torus" representation.........................................................12
Figure 3: Communication scheme .........................................................................14
Figure 4: Simulation cycle....................................................................................16
Figure 5: Entity transaction scheme ......................................................................16
Figure 6: Simulation cycle....................................................................................16
Figure 7: Simulation cycle with migration...............................................................17
Figure 8: Circular view scheme.............................................................................19
Figure 9: View by circular zone with arc (discrete values) ........................................19
Figure 10: Communication graph with AntCO
2
........................................................21
Figure 11: Communication scheme between AntCO
2
and DEDIS ...............................22
Figure 12: Synchronization by ProAvtive................................................................24
Figure 13: Interface to AntCO
2
-Master part ............................................................25
Figure 14: Interface to AntCO
2
-Slave part ..............................................................25
Figure 15: Simulation time response versus number of entities per machine ..............27
Figure 16: Emerging groups .................................................................................28
Figure 17: Group splitting ....................................................................................29
Figure 18: Entities around an obstacle...................................................................29
Figure 19: DEDIS post-simulation visualization.......................................................31
Figure 20: AntCO
2
graph visualizer........................................................................31
Figure 21: AntCO
2
analyze ...................................................................................32



PructicuI Truininq Report

Page 7 of 35

1. INTRODUCTION

This document describes the different experiences, reflections and studies of my
practical training at the computer laboratory between the 8
th
of March 2004 and
the 10
th
of June 2004. It is primarily dedicated to my project, a distributed
entity-based simulator.


1.1. LIH
1.1.1. Presentation
Under the leadership op Jol COLLOC, the Laboratory of Informatics of the
University of Le Havre counts today 2 PU, 2 HDR, 18 MCF and 13 PhD
students.

It main research is done in the following categories:

Evolutionary systems on life and environment
Distributed artificial intelligence - Mutli-agent systems
Real-time database management systems

1.1.2. Members
1.1.2.1. Professors

Alain Cardon
Jol Colloc

1.1.2.2. Lecturers



Thierry Galinho Da Silva
Frdric Guinand
Vronique Jay
Bruno Mermet
Moustapha Nakechbandi
Damien Olivier
Patrick Person
Jean-Luc Ponty
Bruno Sadeg
Frdric Serin
Gale Simon

Laurent Amanton
Dominique Archambault
Mustapha Arfi
Stefan Balev
Cyrille Bertelle
Hadhoum Boukachour
Michel Coletta
Jean-Yves Colin
Marianne Deboysson - Flouret
Claude Duvallet
Dominique Fournier

PructicuI Truininq Report

Page 8 of 35

1.1.2.3. PhD Students


1.1.3. Location
Laboratoire d'Informatique du Havre (EA3219)
25 rue Philippe Lebon,
BP 540
F-76058 Le Havre cedex

Tlphone: +33 (0) 2 3274 4373
Fax: +33 (0) 2 3274 4314

1.2. Related projects
One of the pillars of the research at the LIH is about life-based information
technology models
1
for developing distributed or parallel applications. One of the
main goals is to understand and explain the way certain natural complex
systems work and to reproduce their behaviour in digital simulations. Secondly,
it aims at setting up new conceptual models inspired by models taken from the
real world and their specific mechanisms. As a result, it tries to define new
approaches which are more appropriated for distributed and parallel systems.

The different projects I want to mention are:

Antoine DUTOT
Ant Algorithms for Adaptative Dynamic Distribution
http://www-lih.univ-lehavre.fr/~dutot/Ess2003/index.html

Guillaume PREVOST
Individual-based simulations in estuarial environments
http://www-lih.univ-lehavre.fr/~prevost/sujet.html

Sylvain LEREBOURG
Decentralized models for organisational stream simulations in complex
environments
http://www-lih.univ-lehavre.fr/~lerebourg/

1
http://www-lih.univ-lehavre.fr/Recherches/Themes/miv.html
Sylvain Lerebourg
Denis Meron
Guillaume Prevost
Aurlia Rabia
Samy Semghouni
Pierrick Tranouez
Majed Abdouli
Emna Bouazizi
Roland Coma
Xavier Denis
Antoine Dutot
Jrme Haubert


PructicuI Truininq Report

Page 9 of 35

2. DESCRIPTION

2.1. Context
Due to the variety and diversity of organisms that live in an aquatic ecosystem,
the relative characteristics of each species, and the properties of their
environment, it is very hard to predict what kind of influence any change to the
system may have. For obvious reasons such tests cannot be performed on real
objects. On the one hand, they would be too time-consuming because the
evolution of the global system is very slow; on the other hand, wrong tests may
be harmful and injurious. Thus the need to do these tests in a virtual
environment, using virtual entities, arises. Here failed tests on making changes
to the system do not result in a catastrophe. The main objective is to end up
with a computer simulation of an aquatic ecosystem.

As already mentioned above, such an ecosystem is constituted of a huge number
of interacting entities. This is a complex system because the whole is more than
the sum of its pieces, meaning in this case that the overall behaviour of the
system cannot be explained by the absolute knowledge of each kind of
participating entities. As a matter of fact the behaviour of a single entity
depends on its nearby environment. Thus, every entity is influenced by other
entities which it may influence in return.

In order to obtain significant results, the number of necessary entities can easily
exceed one hundred thousands. This will be too much for a simulation running
on a single processor system. To overcome this problem, distributed simulation
systems should be considered. Spreading the entities over many machines could
speed up the entire simulation considerably.

But an ecosystem is something alive, which means that some entities die and
others will be born. For the simulation, this causes the number of entities which
reside on a given machine not to be stable but constantly changing. Hence there
is a need to perform some kind of dynamic load-balancing between each
participating computer.

Generally speaking, entities which are close to each other exchange more data
than distant ones. This data exchange can also be seen as a type of
communication. They form what is also called a "heavy communicating cluster".
The appearance of such groups and their related movements has not been
programmed explicitly. Their emergence is due to the individual behaviour their
constituting entities.

By extracting the necessary data from the simulation, a communication graph
can be established. The mentioned load-balancing is based upon the latter.

PructicuI Truininq Report

Page 10 of 35

2.2. Objective
DEDIS is the acronym of "Distributed Environment for Dynamic Individual-based
Simulations". The main goal of the DEDIS project is to find a way to dynamically
distribute an entity-based simulation among multiple computers.

One of the major problems of such models is the huge number of entities which
have to participate and communicate with each other. While adding more and
more entities to the simulation and making it run over longer periods of time,
this can easily become very sticky. A possible solution would be to distribute the
entities dynamically among different computers.

Instantly the question about which computer will simulate which part of the
simulation with which entities will arise. Hence the distribution has to take into
account not only the number of entities which reside on a given node, but also
their need for communication with others and the node's calculation power. This
explains why the simulation has to monitor parameters like the communication
variation of each entity or of groups of entities, their number and the average
load of their node.


2.3. Requirements
#1 All code has to be written in Java.

#2 The communication part should be based on the ProActive API.
ProActive is a Java library for parallel, distributed, and concurrent
computing, also featuring mobility and security in a uniform framework.
With a reduced set of simple primitives, ProActive provides a
comprehensive API allowing to simplify the programming of applications
that are distributed on Local Area Networks, on workstation clusters, or
on Internet Grids.
2




2
http://www-sop.inria.fr/oasis/ProActive


PructicuI Truininq Report

Page 11 of 35

3. SIMULATION MODEL

In this section I will describe the model I conceived in order to result in a
dynamic, distributable and large-scale simulation.


3.1. Environment
First of all, I fixed ideas about the environment entities will live in. In order to
keep things clear and simple, I want to study the case where the space is
reduced to a fixed sized 2D ground, divided into different cells.



Figure 1: Space division

This grid does not influence the way entities are executing. It only helps them to
speed up the search for neighbours. As mentioned by [REY01], a straightforward
implementation of a neighbour search algorithm has a complexity of O(n
2
),
because of the fact that a given entity has to query all remaining entities for
their position, and only a suitable spatial data structure allows to reduce this
cost to nearly O(n). As a result, the proposed space division merely serves to
locate entities very quickly.


PructicuI Truininq Report

Page 12 of 35

Two types of environments may be considered:

Closed environment: This means that the environment is closed at every
side, e.g. if an entity reaches the left border it cannot move further on and
is blocked by the environment border.

Open environment: In this case the borders are not closed but open, which
means that an entity leaving the environment at the left side will re-enter
the environment at the right side. As a matter of fact the environment
may be seen as a "torus":


Figure 2: Environment "torus" representation

3.2. Entities
Every piece of the simulation needing special processing or interactions is called
an entity. Each of them has individual needs and reacts or evolves in a different
way. Every entity has its own life; it is responsible for its behaviour. It may be
influenced by nearby entities but the latter must not take a decision in its stead.

Each entity is tied to a cell on the environment. As mentioned above, the
partitioning of the space is only used to speed up neighbour research. In fact
every entity may "look around" to find nearby entities. Depending on its
neighbourhood it might take a certain decision, as for example to move away
from its actual position or to send out a message to other entities.


3.3. Simulation architecture
Because such a simulation needs a minimum of synchronization, I made the
decision to use master-slave scheme with a global clock and a kind of entity
directory.

In the next few paragraphs I will try to give a more detailed description of the
roles of each part. I will also discuss its advantages and disadvantages and point
out where I have encountered problems.

PructicuI Truininq Report

Page 13 of 35

3.3.1. The master
The master's first role is to manage all its slaves. This comprises the
knowledge about the registered clients and the ability to dispatch incoming
messages among them.

As already mentioned, the master includes a global clock, on which the
slaves are synchronized. This implies that only the master can initiate,
pause or stop a simulation.

Furthermore this part is responsible for maintaining a lookup table for the
slave-entity relation. This allows a slave, or the master itself, to quickly find
out on what other slave a given entity is located or reversely which entities
are positioned on a given slave.


3.3.2. The slave
The slaves are the big working parts of the simulation. They are controlled
by the master and have to register with it before the simulation can begin.
Each of the slaves holds a certain number of entities and is responsible for
their execution. Each slave has to keep updated a repository of the entities
it owns.

Moreover, it has to offer different services to its own entities as well as to
other slaves. As a matter of fact, an entity may want to communicate with
another one, but as it does not have the knowledge about the exact location
of the entity, it must have the possibility to delegate this job to the slave it
is located on. The latter must offer a kind of local communication service,
which will guarantee that messages and queries passed across the network
will arrive at their destination. This same service will also allow inter-slave-
communication.

For performance reasons, I decided not to attribute a thread to each entity,
but rather to let each slave make its own schedule. This makes the slave
machines a lot more reliable and decreases their response time.

While working on a prototype slave, I encountered blocking problems, which
were related to ProActive's request-management. In fact, after posting a
remote request, active objects [PRO04, p. 8] block until the answer has
been returned. This means that when two slaves are sending out a request
to each other, each of them is waiting for the other's response. Of course,
this blocks the entire simulation, because the master waits for the slaves'
result. Thus, in order to exclude mutual blocking of slaves, I composed each
slave of two different threads: the first one is responsible for executing
incoming requests from other slaves or from the master and the second
schedules the execution of its entities, only making outgoing requests.

PructicuI Truininq Report

Page 14 of 35


Master Slave1 Slave2
execute 1
execute 2
launch
launch
a new thread inside the slave
slave-slave communication
slave-master communication

Figure 3: Communication scheme


3.4. Environment distribution
When thinking about distributing the simulation, one must also consider the
environment, which is common to all entities, can be distributed and shared
among all participating machines. Different scenarios may be considered:

3.4.1. Static distribution
With a static environment distribution, every client will statically receive a
certain part of the environment. This implies that the entire environment is
not really shared, but divided and then spread among the slaves. Regarding
the objectives of the project, this method presents two main disadvantages:

The number of client machines is static and cannot change during the
execution.

Fixing the environment distribution will also fix the location of the
entities, which means that the distribution of the entities will mainly
depend on their position and not on their communication volume with
other.

These disadvantages are opposed to the project's objectives of dynamic
entity distribution; this method is therefore rejected.


PructicuI Truininq Report

Page 15 of 35

3.4.2. Dynamic distribution
Distributing the environment dynamically would mean that the cells would
be distributed among the clients depending on the client load. Again, this
approach does not respect the fact that entities will communicate more or
less.

Furthermore, determining dynamically what cell to position on what machine
introduces on one hand an overhead in calculation time and on the other
hand implies that cells additionally must be able to migrate from one
machine to another; this also means more communication. Given the scale
the simulation may reach, this method will not fit our needs either and can
be excluded.

3.4.3. No distribution
As the environment only serves to determine quickly what entity resides on
a given cell, I think there is no real need to distribute the environment. As
solution, I propose that every client machine should possess its own entire
environment. The environment can be seen as a simple lookup table which
maintains for a given pair(x,y) of coordinates a list of entities. This is quite
similar to what [MOR96] describes in his paper about large-scale distributed
simulations.

When a given entity wants to access another one, whether local or remote,
it has to delegate its local communication service. This is similar but not as
expensive as the model [DIE98] conceives. In his model an entity on a
given machine communicates with a remote entity via a proxy object,
representing the distant entity.


3.5. Simulation model
Having clarified things this far, I now want to take a closer look at the simulation
itself. Limited by the fact that it has to be dynamic, distributable and large-scale,
the following questions have to be answered:

In general, entity simulations are very synchronous. How can one break
this synchronism without loosing data integrity?

Because entities may die or be born, the scenario's characteristics are
dynamic. Thus intelligent distribution is difficult [BRU98] because on the
one hand the available computational power of each machine is difficult to
quantify and on the other hand network behaviour may be very chaotic. So
the major question is how can entities be distributed equally among all
participating computers?

PructicuI Truininq Report

Page 16 of 35


execution synchronisation synchronisation commit

Figure 4: Simulation cycle

3.5.1. Entity transactions
During the simulation, the execution of a given entity may depend on the
values of some attributes of other entities. Thus the major problem is to
make sure that the state of the queried entities does not change.
Unfortunately simulation is placed in a distributed context, so one possible
solution is to give every entity two states: an old one, the "read" state, and
the actual one, the "write" state. In order to implement this, a commit
operation, to copy the values from the write state to the read state, has to
be introduced.


variable ABC_read
variable ABC_write
read
write
commit
copy value

Figure 5: Entity transaction scheme

This technique partially allows the breaking of the simulation's synchronism.
In fact, when an entity executes and needs to access other entities, it is
only allowed to obtain data from their read state. Nobody can modify the
write state except the entity itself. Thus an entire simulation cycle will be
composed as shown on the following figure:


execution synchronisation synchronisation commit

Figure 6: Simulation cycle

3.5.2. Entity migration
To solve the problem with the changing number of entities and the load of
the client machines, a dynamic distribution of the entities must be set up.
This implies that an entity must be able to migrate from one slave to
another one.


PructicuI Truininq Report

Page 17 of 35

The state of a given entity is only determined by the values of its attributes.
The simplest and easiest way to achieve this is to capture the entity's
relevant attributes and to transfer these to the destination machine before
removing the entity. Once arrived, the destination slave will instantiate a
class of the same type as the entity and copies the values of the attributes
to it.

In addition to the execution and commit cycles of the simulation, a
migration phase has to be introduced. During this phase entities have to
decide whether to migrate to another machine.



execution synchronisation synchronisation synchronisation commit migration

Figure 7: Simulation cycle with migration


Finally, the question of how to decide when and where to migrate arises. As
a matter of fact, this decision depends on various parameters: the average
load of the machine which an entity resides on, the total number of entities
on this machine or the amount of communication the entity has do make
with others.

This is where the DEDIS project has to be interfaced with the AntCO
2

project, which offers load-balancing suggestion services. Further details
about the connection to AntCO
2
will be discussed in the next chapter.


3.6. Interactions
Basically there exist two different possibilities for entities to communicate with
other ones: either an entity "pushes" information towards or "pulls" it from
another one.

3.6.1. Push
A push interaction is characterized by the fact that a given entity "pushes"
information towards another entity. The simplest way to make this work is by
using a kind of messaging system.

Pushing information from one entity to another one is not a very flexible
method and does not really reflect the emitter entity's independence.
Nevertheless, this method presents some advantages, mainly when a precise
entity should be contacted or an asynchronous action is needed.


PructicuI Truininq Report

Page 18 of 35

It is also important to notice that during the push method, the emitter has to
communicate actively with its neighbours. As a matter of fact, it needs two
actions: one from the emitter entity, which has to send data to the receiver,
and one from the receiver entity, which has to look up the newly arrived
message and process it.

3.6.2. Pull
Pulling information from another entity means reading its state, or at least a
part of it. During this action, only the receiver entity needs to participate
actively in data exchange. As a matter of fact, the emitter entity is not
required to perform any operation.

The main disadvantage of this method is that the receiver, the entity that
wants to query another one, does not have direct references to its
neighbours, nor does it know their. So it seems to be evident that, in order to
acquire references to the entities which surround it, the receiver entity has to
post a query to every other machine, asking for the content of all the cells in
its neighbourhood.

Different query strategies have been considered:

3.6.2.1. Cell by cell
This is the first and simplest strategy. The receiver entity defines a group
of cells whose content it wants to get hold of. It then loops trough this list
and makes for each entry it its local communication service query for each
entry all other participating slaves.

This method is really slow because a new query has to be emitted for each
selected cell. It can be improved by sending a pair of coordinates to each
slave and retrieve a single result set and, more importantly, by emitting
the definition of a group of cells. Thus the entire result is obtained in one
go.


3.6.2.2. By rectangular zone
With this technique, the requesting entity queries its neighbouring cells
only once by transmitting the definition of a rectangular zone of the field.
Hence it transmits coordinates of the top left point as well as the width and
the height of the zone. It then receives result sets from each participating
slave and merges them into one array. The latter includes all relevant
information.

This method speeds up the communication part because only one request
is made and only one, more or less big, result set is retrieved. As a matter
of fact, a lot of small queries take much more time than a big one.

PructicuI Truininq Report

Page 19 of 35

3.6.2.3. By circular zone with arc
This strategy is based on the fact that in real life an entity normally is not
able to capture everything that happens around it. As described by
[JLP04], the view of an entity is defined by a radius and an angle.

angle
distance

Figure 8: Circular view scheme

The query method is the same as the one for the method described above,
hence its efficiency is roughly the same as well. Of course calculations
have to be mapped onto discrete values; as can be seen on the figure
above, the covered zone is somewhat angular.


Figure 9: View by circular zone with arc (discrete values)



PructicuI Truininq Report

Page 20 of 35

3.7. Behaviour
How do entities live and how do they act? The following few lines will try to give
a quick overview of the entity behaviour model I set up.

As explained in the previous chapter, each entity can receive messages from
others, so the first thing that an entity does during its cycle is to read out these
messages and process them. Depending on the content of the message the
entity may decide to initiate an action, as for example when it receives a "KILL"-
message, it dies.

Its next step is to scan its environment for nearby entities. The nature as well as
the state of a neighbour determines whether a neighbour attracts or repulses the
entity. The calculation of the final direction and speed of the entity is based on
vector-calculations as described by [REY01]. Of course I had to tweak attraction
and repulsion coefficients according the considered entity. For example, a prey is
repulsed more strongly by a predator than by an obstacle.

Another task of an entity, for example the predator entities, is to kill other ones.
For this special example, a message is sent to the concerning entity. Other
scenarios can be considered, as for example a moving entity, which informs its
neighbours where it will move next or what are its goals or needs.




PructicuI Truininq Report

Page 21 of 35

4. INTERFACE TO AntCO
2


This part describes how DEDIS and the AntCO
2
project of Antoine DUTOT
3

interact. It can be seen as the glue that binds together our projects.


4.1. Interaction
As said in [BDG03], AntCO
2
aims to provide some kind of load-balancing for
dynamic communication-based graphs. Thus the idea arises to couple DEDIS
with AntCO
2
, in such in way that both applications remain independant up to a
certain degree. Hence DEDIS will host the main simulation whereas AntCO
2
's
goal is to provide a dynamic suggestion on how and where to migrate entities. In
other words, AntCO
2
offers a service to DEDIS.


4.2. Communication
4.2.1. General concept
As AntCO
2
is accessed by DEDIS only as a service, the communication
between both parts is unidirectional. This means that DEDIS can launch
request to AntCO
2
, but the contrary is not permitted.

DEDIS Master DEDIS Slave
AntCO2
{1}
{1..*}
{1}

Figure 10: Communication graph with AntCO
2
The figure above shows in what direction the communication takes place,
but it hides the fact that a given slave is communicating with other slaves at
the same time and that AntCO
2
may be distributed. As a matter of fact, a
single AntCO
2
part might manage different slaves.

We use the ProActive API
4
as the communication layer between any
distributed parts. It allows us to synchronize AntCO
2
easily with the slaves
and gain a maximum of transparency between the two applications.


3
http://www-lih.univ-lehavre.fr/~dutot/Ess2003/index.html
4
http://www-sop.inria.fr/oasis/ProActive

PructicuI Truininq Report

Page 22 of 35

Viewed from the side of the application layer, each DEDIS slave has to
inform AntCO
2
about what happens in the simulation and must hold a
remote interface to an AntCO
2
part.

As can be observed on the figure below, the DEDIS application forms a
complete graph. This means that every node knows all other nodes and has
to communicate with them for example if an entity wants to know its
neighbours, its slave will query all other slaves in order to satisfy the
request.

DEDIS Master
DEDIS Slave
AntCO2
DEDIS Slave
DEDIS Slave
DEDIS Slave
DEDIS Slave
DEDIS Slave
AntCO2
AntCO2
DEDIS Slave

Figure 11: Communication scheme between AntCO
2
and DEDIS

As for AntCO
2
, the final architecture has not yet been set up. Nevertheless,
the most generic way of looking at it, is to consider AntCO
2
as distributed in
some way and as connected to it through ProActive. This approach ensures
that communication is completely independent of AntCO
2
s later
architecture.

Using ProActive also offers the advantage that it leaves the infrastructure
used by DEDIS and AntCO
2
independent. As shown on the figure above,
there may be, for example, 7 DEDIS slaves and only 3 AntCO
2
parts. Again,
as each part is distributed, this does not fix which part runs on which
machine.


PructicuI Truininq Report

Page 23 of 35

4.2.2. Scenario
A possible communication scenario between DEDIS and AntCO
2
could look
like this:

1. AntCO
2
starts up with an empty graph and registers itself in the RMI
registry.

2. DEDIS Master starts up, registers it self in the RMI registry and
connects to the AntCO2 master part.

3. Each time a DEDIS slave in launched and registers with the DEDIS
master, the latter adds a color to AntCO
2
and passes the returned URI
to the slave. The slave then has to establish a connection to the given
AntCO
2
part.

4. The simulation is initialized and started. The following events might
happen:

An entity delegates a migration suggestion.

An entity decides to migrate to another node.

A new entity is born.

An active entity dies and disappears.

...

Each of these events needs to communicate with the node's relative
AntCO
2
part, either to inform it about changes or to query it for a new
color suggestion.




PructicuI Truininq Report

Page 24 of 35

4.2.3. Mutual blocking problem
When moving a node from one slave to another, AntCO
2
must be informed
of this movement. In order to maintain a minimum of synchronisation
between DEDIS and AntCO
2
, the migration of the node on AntCO
2
must
terminated before the DEDIS slave continues its work.

If AntCO
2
migrates the given node faster than DEDIS, no problem will arise,
but if the migration on AntCO
2
is slower than the one on DEDIS, the latter
has to wait for AntCO
2
.

In order to solve this problem, we use ProActive's support for asynchronous
method invocation through future objects [PRO04, p. 16]. By changing the
method "moveNodeTo" to return a result-object and by making the DEDIS
slave access it, the latter will block until AntCO
2
has finished its migration.


DEDIS slave
AntCO2
moveNodeTo
return Future
return Value
Migrate node
Migrate node
{block}

Figure 12: Synchronization by ProAvtive




PructicuI Truininq Report

Page 25 of 35

4.2. Interface
AntCO
2
has to offer two distinct interfaces to the outside. The first one is a
general administrative interface, named "InterfaceAntCO2Master", through
which colors can be added, removed or updated. The "weight parameter
quantifies the computational power of a node.


InterfaceAntCO2Master
+addColor(colorID:String,weight:double): String
+removeColor(idColor:String): void
+updateColor(colorID:String,weight:double): void

Figure 13: Interface to AntCO
2
-Master part

The second interface, which is tied directly to a certain color, allows the adding
and removing of nodes. These can also be moved or connected to other nodes.
The "suggestColor" method queries AntCO
2
s state of a given node. Beside the
suggested color, the result contains a kind of trust-index which is a self-
evaluation of AntCO
2
s color suggestion. This is needed because it is possible
that a node may switch permanently between two colors. This "blinking is then
detected by AntCO
2
and decreases the returned trust-index on which the caller
entity will base its decision whether it should migrate or not. Indeed, AntCO
2

only makes a proposal and the considered entity has no obligation to follow it.


InterfaceAntCO2Slave
+addNode(nodeID:int): void
+removeNode(nodeID:int): void
+moveNodeTo(nodeID:int,colorID:String): BooleanBox
+addConnection(fromNode:int,toNode:String,weight:int): void
+removeConnection(fromNode:int,toNode:int): void
+updateConnection(fromNode:int,toNode:int,weight:int): void
+suggestColor(nodeID:int): ColorSuggestion

Figure 14: Interface to AntCO
2
-Slave part







PructicuI Truininq Report

Page 26 of 35

5. SIMULATION RESULTS

This chapter will describe the different results obtained during simulation runs.


5.1. Scalability
During the setup of the distributed simulation environment, different
performance tests were executed in order to study the overall behaviour of the
simulation as a function of the number of entities on each machine.

Early experiments consisted of simulation runs on a single machine with fixed
parameters (meaning a fixed number of entities), the same execution
environment (the simulation was considered as the only really active process on
the machine), the same kind of entities, and an identical field size. As this
experiment was running on a single machine, there was no need to take into
account external communication. Fixing all these parameters allowed the testing
of different communication independent algorithms such as for example the
entity scheduling and execution algorithm on each slave.

Each time acceptable results where obtained for a given set of parameters, the
number of entities on the machine were increased in order to perceive how the
behaviour of the machine was evolving.

In a second phase, the tests where enlarged to more than one machine, thereby
introducing network communications. The rest of the parameters were held
constant, except for the number of entities per machines. Unfortunately, the first
results were very poor. Simulation runs were extremely slow and, as the number
of entities rose, the time it took to simulate a given number of cycles grew
exponentially. Therefore I introduced some caching mechanisms for repeated
requests and allowed the different slaves to intercommunicate better.

The next figure shows how the time response of the simulation evolves if the
number of participating machines as well as the environment size is constant
and only the number of entities change. For this experiment I used as master
machine a Pentium IV M 1,8GHz and five Pentiums IV 2,4Ghz slave machines,
each equipped with 512MB of RAM. The machines were connected to a 100MBit
switched Ethernet network.

Only one kind of entities was used for this simulation. They were designed to
consider all neighbours in a range of 20 cells and then to move towards the
closest. Although this implies a huge number of communication with surrounding
entities, the saturation of the network did not surpass 5%.


PructicuI Truininq Report

Page 27 of 35

As can be observed, the behaviour is likely to be linear. It is also worthy to
notice that, as only the number of entities per machine is drawn on the X-axes,
the absolute difference between two steps is 500 entities (because there are five
participating slaves).
500x500, 100 cycles, log
0
10000
20000
30000
40000
50000
60000
0 100 200 300 400 500 600 700 800 900 1000 1100
entites per machine
t

(
m
s
)
run 1
run 2

Figure 15: Simulation time response versus number of entities per machine

At the time I performed these runs, the simulation contained two type of
entities: the predators and the preys. Both of them were able to see other
entities within a certain radius. The preys were trying to form groups with other
preys and to run away from predators. The latter, of course, were chasing the
preys. Because I wanted to take measures in constant conditions during the
experiment runs, predators were not allowed to eat the caught preys.



PructicuI Truininq Report

Page 28 of 35

5.2. Entities
The next stage consisted of analysing more closely the behaviour of the entities,
especially emerging group effects. For this, I set up different rules, for example
the moving toward entities of the same type, the moving away from enemies or
the reaching of a certain location.


5.2.1. Grouping
Based on Reynolds [RAY87] "boids", as described in [JLP04] I tried to
obtain first of all some grouping effect. Depending on several parameters,
as for example the repulsion coefficient, the attraction coefficient or the
view angle and distance, the grouping effect is stronger.



Figure 16: Emerging groups

5.2.2. Ungrouping
At the next stage, I added more entities and slightly changed the different
parameters and rules. Then, hoping to see some prey groups splitting and
rejoining, I also let out some predators as well.

PructicuI Truininq Report

Page 29 of 35


Figure 17: Group splitting

On the above figure one can see a predator (the big spot) approaching a
group. As it moves closer to the group, the preys try to. This leads to a
division of the group into two.


5.2.3. Obstacles
Next, I introduced obstacles to the simulation. These are entities like the
predators and the preys, but they do not move. Obstacles are thus passive
entities which are influencing others through their presence.

The figure above shows two obstacles surrounded by a group of preys. The
major difficulty was to tweak the repulsion coefficient in such way that the
entities did not run against the obstacle but were allowed to enclose it.


Figure 18: Entities around an obstacle


PructicuI Truininq Report

Page 30 of 35

5.2.4. Inactive entities
An obstacle is modelled as being an ensemble of distinct passive entities.
Thus the total number of entities in a simulation grows very fast when a
lot of obstacles are added. The latter may not need any processing power;
therefore I decided to give every entity a flag which tells the simulator if it
is active of passive. Depending on this flag the entity is executed or not.
Hence the overall performance of simulations with multiple obstacles
increased.

Although this works well enough, I am not satisfied with this solution.
Rather than defining an obstacle as an ensemble of entities, I would prefer
to give it a fixed size. This would imply different changes to the global
entity model. For example, a single entity would be allowed to occupy
more than one cell. Going this way, rectangle obstacles can be defined as
single entities with a width and a height attribute. More generally, any
specialized obstacle can be set up by implementing a certain
"ObstacleInterface" in order to make it compatible with the rest of the
simulation.


5.3. Communication graphs
Each simulation run writes down its log files. These log files contain information
about the position of an entity and about the communication top other entities
for each cycle. For convenience and compatibility I chose the XML format, which
also presents the advantage that the log files remain easily readable.

In order to test AntCO
2
, the log files had to be transformed into dynamic graphs.
After this step, two different tests were realized: the first one introduces the
dynamic graphs into the AntCO
2
's graph visualizer and the second one puts the
same dynamic graph into the ant analyser.


PructicuI Truininq Report

Page 31 of 35


Figure 19: DEDIS post-simulation visualization

The above figure (Figure 19) shows the effective positions of a given test
simulation run. The import of the associated dynamic graph inside AntCO
2
's
graph visualizer can be seen on the figure below (Figure 20). The left image
represents the initial situation, whereas the right one shows the positions,
respectively the communications, of step 92.


Figure 20: AntCO
2
graph visualizer

One can easily see that the left-hand side figure is constituted of a lot of isolated
entities. However, later on they form different clusters with heavy
communication connections.

PructicuI Truininq Report

Page 32 of 35


On the next figure, ants are let out to populate this dynamic graph and detect
the different clusters. On the figure beneath, there are four different ant
populations. Some clusters are colonized by two or more different populations.
This is due to the agitated nature of the dynamic graph, which comes originally
from the simulation. Thus clusters join und split very fast.


157
92
118
187
115
185
159
94
68
135
67
136
52
96
95
69
133
32
76
152
155
13
117
132
180
14
17
36
16
37
55
33
15
158
161
114
184
53
31
54
48
186
181
93 179
130
160
102
173
39
103
112
72
50
34
113
131
150
74
18
111
57
199
35
177
174
175
19
51
176
154
70
151
137
80
99
134
56
40
77
89
119
97
38
178
156
75
71
138
21
116 22
139
73
153
98
91
63
143
148
127
141
78
146
194
125
167
86
20
42
145
64
126
43
196
144 87
79
44
9
90
88
41
198
149
195
58
122
60
6
59
5
192
62
193
172
7
191
106
65
107
197
110
8
147
66
29
165
0
190
49
82
83
24
61
121
23
170
182
120
183
101
30
171
189
25
188
166
123
26
100
164
169
28
85
105
3
129
27
1
46
163
11
104
4
84
124
12
45
168
2
81
109
162
47
140
108
10
128
142

157
92
118
187
115
185
159
94
68
135
67
136
52
96
95
69
133
32
76
152
155
13
117
132
180
14
17
36
16
37
55
33
15
158
161
114
184
53
31
54
48
186
181
93
179
130
160
102
173
39
103
112 72
50
34
113
131
150
74
18
111
57
199
35
177
174
175
19
51
176
154
70
151
137
80
99
134
56 40
77
89
119
97
38
178
156
75
71
138
21
116
22
139
73
153
98
91
63
143
148
127
141
78
146
194
125
167
86
20
42
145
64
126
43
196
144
87
79
44
9
90
88
41
198
149
195
58
122
60
6
59
5
192 62
193
172
7
191
106
65
107
197
110
8
147
66
29
165
0
190
49
82
83
24
61
121
23
170
182
120
183
101
30
171
189
25
188
166
123
26
100
164
169
28
85
105
3
129
27
1
46
163
11
104
4
84
124
12
45
168
2
81
109
162
47
140
108
10
128
142

Figure 21: AntCO
2
analyze

These analyses are somewhat "static". In fact, the ant's movements are
synchronized with the dynamic changes of the graph. This means that a
calculation phase for the ants is followed immediately by a simulation cycle
which is expressed by a set of changes that occur inside the graph.

Once DEDIS and AntCO
2
runs in an active communication mode, meaning that
AntCO
2
will analyze the simulation's communication graph in real-time, both
parts will be decoupled and will be no longer synchronous. This implies that ants
might move faster or slower, the first being a positive result, the second a
negative one.




PructicuI Truininq Report

Page 33 of 35

6. CONCLUSION

6.1. The project
This project has been very fascinating and rewarding. The dynamic side, the
distribution of the entities during the simulation, was especially challenging.
Even during harder periods, when dealing with the distribution of the simulation
itself and with a lot of node inter-blocking problems, it allowed me to develop
my personal skills.

But other parts of the project turned in a kind of adventure, such as doing a lot
of research and trying out many possibilities until a way was found to make it
operational and coherent. This concerns above all the modelling of the entities.
Incoherent, uncontrolled or unpredictable movements were just some of the
problems encountered. It was not always easy to find a solution.

Whichever way I look at it, this was a really interesting and fascinating project.


6.2. General conclusion
I am very satisfied with the experience gained during this practical training. I
was given the possibility to work hand in hand with other researchers inside a
laboratory, learning about essentials in research.

Working in a laboratory hand in hand with other researchers, gave me the
opportunity to learn about essential research techniques. I was able to fill gaps
concerning document research and project interfacing. Furthermore, I was
shown once again how important good communication between group members
working on related projects is.

Last but not least, I want to mention how pleased I was to work on the given
project. Nevertheless I know that theory is the key to every project and I have
to admit that I am not a great theorist, which is why the implementation part,
and thus the fact to have achieved something, was very important to me.
Encountering concrete problems and difficulties motivated me the most.

PructicuI Truininq Report

Page 34 of 35

7. REFERENCES & ACRONYMS


7.1. Acronyms
API Application Programmer Interface
HTTP HyperText Transport/Transfer Protocol
LAN Local Area Network
RPC Remote Procedure Call
RMI Remote Method Invocation
SQL Structured Query Language
WWW World Wide Web
XML eXtensible Markup Language


7.2. Links
Universitof le Havre http://www.univ-lehavre.fr

Laboratory of Informatics http://www-lih.univ-lehavre.fr

ProActive API http://www-sop.inria.fr/oasis/ProActive

AntCO
2
http://www-lih.univ-lehavre.fr/~dutot/Ess2003/



PructicuI Truininq Report

Page 35 of 35

7.3. References
[BDG03] BERTELLE Cyrille, DUTOT Antoine, GUINAND Frdric, OLIVIER
Damien, In DOA 2003 International Symposium on Distributed
Objects and Applications, Catania (Sicile) October 2003, [online]
http://www-lih.univ-lehavre.fr/~dutot/Papers/doa2003.pdf

[BDG04] BERTELLE Cyrille, DUTOT Antoine, GUINAND Frdric, OLIVIER
Damien, Distribution of Agent Based Simulation with Colored Ant
Algorithm, In ESS 2002 European Simulation Symposium, Pages 39-
43, Dresden (Germany) October 2002, [online] http://www-lih.univ-
lehavre.fr/~dutot/Papers/ess2002.ps

[BRU98] BRUNETT Sharon, FITZGERALD Steven: Metacomputing Supports
Large-Scale Distributed Simulation, CACR - 162, May 1998, [online]
http://www.cacr.caltech.edu/SFExpress/pubs/sc98/sc98.html

[DIE98] MALPICA Diego, RUDMN Isaac, Object oriented Architecture for
distributed virtual reality, Proceeding of the 2 Virtual and Intelligent
Environments Workshop, Xalapa,Estado de Veracruz, Sept. 10-14,
1998, [online] http://www.uhd.edu/academic/colleges/sciences/ccsds/
grants/mexico/papers/2view/4/cont.doc

[JLP04] PONTY Jean-Luc, OLIVIER Damien: Vie artificielle Les boids

[MOR96] MORSE Katherine L.: Interest Management in Large-Scale Distributed
Simulations, University of California, Irvine Technical Report TR 96-27,
1996, [online] http://www.cs.nott.ac.uk/~mhl/archive/Morse:96a.pdf

[PRO04] INRIA Sophia Antipolis, ProActive Manual, April 2004, [online]
http://www-sop.inria.fr/oasis/ProActive/doc/ProActiveManual.pdf

[REY87] REYNOLDS Craig: Flocks, herds and schools: A distributed behavioral
model. Computer Graphics, 21(4):25-34.

[REY01] REYNOLDS Craig: Boids, Background and Update, 6 September 2001
[online] http://www.red3d.com/cwr/boids/

You might also like