DATA MINING WITH ANT COLONY OPTIMIZATION

ANT ALGORITHM

Another Collective Intelligence Approach

Ants appeared on earth some 100 million years ago

They have a current total population estimated at 10^16 individuals

Most of these ants are social insects living in colonies (population may vary from 30 to millions of individuals)

For several decades researchers have been fascinated by the emergent behavior of colonies of social insects, such as ants

Despite the relative simplicity of an individual's behavior, the colony as a whole can exhibit highly adaptive behavior, leading researchers to consider adapting these collective processes for use in artificial intelligence

Food Foraging Behaviour

Observations of the foraging behavior of ants have inspired the development of ant-based algorithms

These algorithms are used mainly to solve combinatorial optimization problems defined over discrete search spaces

One of the first behaviors studied by ethologists was the ability of ants to find the shortest path between their nest and a food source. These studies resulted in the first algorithmic models of the foraging behavior of ants (Dorigo)

How do ants find the shortest path between their nest and a food source, without any visible, central, active coordination mechanism?

Studies have shown that the search for food is random in the beginning

As soon as a food source is located, activity becomes more organized, with more and more ants following the same (shortest) path to the food source

Ant Algorithm

An artificial ant can be considered a simple computational agent. Logic is implemented in the artificial ant to select a path whenever several paths can be followed

A number of artificial "ants" construct solutions to the problem at hand by repeatedly selecting parts from a predefined set of solution components

These ants select components probabilistically, biased by heuristic information (a problem-specific heuristic measure of a component's utility) and pheromone information, typically associated with the solution components

To simulate the real-world process by which ants find the shortest path to a food source, artificial ants deposit pheromone on components in proportion to the quality of the solutions that contain them

This is similar to a shorter path receiving pheromone reinforcement sooner than a longer path (as ants return from the food source sooner), thereby increasing the likelihood that later ants will choose the shorter path over the longer one

ACO has been applied to many problems and is well suited to discrete optimization problems, such as
• Job scheduling
• Subset problems
• Network routing
• Vehicle routing
• Bioinformatics
• Data mining

To apply ACO to a problem we need
• A representation of a solution
• A method to determine the fitness of a solution
• A heuristic measure for the solution's components

Consider the general problem of finding the shortest path between two nodes on a graph, G = (V, E), where V is the set of vertices (nodes) and E is a matrix representing the connections between nodes

Ant Algorithm and the TS problem

The Traveling Salesperson Problem (TSP) is to find, for n given cities, a shortest closed tour that contains every city exactly once

In each generation, each of m ants constructs one solution

An ant starts from a random city and iteratively moves to another city until the tour is complete and the ant is back at its starting point

When an ant decides which city to move to next, it does so with a probability that is based on the distance to that city and the amount of pheromone on the connecting edge

Let dij be the distance between the cities i and j

The probability that the ant chooses j as the next city after it has arrived at city i, where j is in the set S of cities that have not been visited, is

$$p_{ij} = \frac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{k \in S} \tau_{ik}^{\alpha}\,\eta_{ik}^{\beta}}$$

Here ij is the amount of pheromone on the edge (ij)

ij = 1/dij is a heuristic value,

and  and  are constants that determines the relative


influences of the pheromone value and that of the heuristic
value on the decision of the ant
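
The city-selection rule described above can be illustrated with a small Python sketch. It assumes the usual form of the rule, in which each unvisited city j is weighted by the pheromone raised to α times the heuristic value raised to β, and the weights are normalized over the unvisited set S. The function names, the toy distance matrix, and the default values of `alpha` and `beta` are illustrative assumptions, not taken from the slides.

```python
import random

def transition_probabilities(i, unvisited, tau, dist, alpha=1.0, beta=2.0):
    """Return {j: p_ij} for every unvisited city j, given the current city i."""
    weights = {}
    for j in unvisited:
        eta = 1.0 / dist[i][j]                       # heuristic value: inverse distance
        weights[j] = (tau[i][j] ** alpha) * (eta ** beta)
    total = sum(weights.values())                    # normalize over the unvisited set S
    return {j: w / total for j, w in weights.items()}

def build_tour(start, tau, dist, rng, alpha=1.0, beta=2.0):
    """One ant builds a closed tour by sampling the next city from p_ij."""
    tour = [start]
    unvisited = set(range(len(dist))) - {start}
    while unvisited:
        probs = transition_probabilities(tour[-1], unvisited, tau, dist, alpha, beta)
        cities = sorted(probs)
        nxt = rng.choices(cities, weights=[probs[c] for c in cities], k=1)[0]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour + [start]                            # return to the starting city

# Toy instance (assumed values): three cities, uniform initial pheromone.
dist = [[0, 1, 2],
        [1, 0, 4],
        [2, 4, 0]]
tau = [[1.0] * 3 for _ in range(3)]

probs = transition_probabilities(0, {1, 2}, tau, dist)
tour = build_tour(0, tau, dist, random.Random(0))
```

With uniform pheromone and β = 2, the nearer city 1 gets weight 1 and city 2 gets weight 0.25, so the ant at city 0 moves to city 1 with probability 0.8.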

The solution might have been even better if the city had been replaced by another city

But now, due to its high pheromone value, it will continue to have a strong probability of selection for a considerable length of time (until pheromone values on competing cities become much higher)

A high level of pheromone will be associated only with those links that are continuously selected in good solutions

DATA MINING WITH AN ANT COLONY OPTIMIZATION ALGORITHM

• The goal of data mining is to discover knowledge that is not only accurate, but also comprehensible for the user

• Comprehensibility is important whenever discovered knowledge will be used for supporting a decision made by a human user

• If discovered knowledge is not comprehensible, it can lead to incorrect decisions
CLASSIFICATION PROCESS: MODEL CONSTRUCTION

Training Data -> Classification Algorithms -> Classifier (Model)

Training data:

NAME   RANK             YEARS   TENURED
Mike   Assistant Prof   3       no
Mary   Assistant Prof   7       yes
Bill   Professor        2       yes
Jim    Associate Prof   7       yes
Dave   Assistant Prof   6       no
Anne   Associate Prof   3       no

Classifier (model):

IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
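
As a quick check of the classifier above, the following Python sketch applies the rule to the six training cases. It assumes that any case not matched by the rule is labeled 'no'; the function and variable names are illustrative.

```python
# Training data from the slide: (name, rank, years, tenured)
training_data = [
    ("Mike", "Assistant Prof", 3, "no"),
    ("Mary", "Assistant Prof", 7, "yes"),
    ("Bill", "Professor",      2, "yes"),
    ("Jim",  "Associate Prof", 7, "yes"),
    ("Dave", "Assistant Prof", 6, "no"),
    ("Anne", "Associate Prof", 3, "no"),
]

def predict_tenured(rank, years):
    """IF rank = 'professor' OR years > 6 THEN 'yes' (assumed default: 'no')."""
    return "yes" if rank == "Professor" or years > 6 else "no"

# Count how many training cases the rule labels correctly.
correct = sum(
    predict_tenured(rank, years) == tenured
    for _, rank, years, tenured in training_data
)
```

On this training set the rule reproduces all six labels.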

• The Ant-Miner (ant colony based data miner) algorithm was proposed by Rafael S. Parpinelli
• The goal is to extract classification rules by using the Ant Colony Optimization algorithm
• The performance of Ant-Miner is compared with CN2, a well-known data mining algorithm for classification, on six public domain data sets

• In classification, discovered knowledge is often expressed as
IF <conditions> THEN <class>
• Each condition is referred to as a term, so that the rule antecedent is a logical conjunction of terms, such as
IF term1 AND term2 AND ...
• Each term is a triple <Attribute, Operator, Value>, such as <Gender = male>
ANT-MINER

• The current version of Ant-Miner copes only with categorical attributes, so that the operator element is always "="
• Continuous (real-valued) attributes are discretized in a preprocessing step
• It follows a sequential covering approach to discover a list of classification rules covering all, or almost all, the training cases
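
The sequential covering loop can be sketched in Python as follows. This is only an outline of the outer loop: `find_one_rule` is a hypothetical stand-in that picks the purest single-term rule, whereas Ant-Miner would search for each rule with a colony of ants. The `max_uncovered` threshold, the data layout, and the discretized `years` values are assumptions made for illustration.

```python
from collections import Counter

def covers(rule, case):
    """A rule covers a case when every (attribute, value) term matches."""
    return all(case[attr] == value for attr, value in rule["terms"])

def find_one_rule(cases, attributes, class_attr):
    """Hypothetical stand-in for one rule search: returns the single-term rule
    with the highest purity (ties broken by the number of correct cases)."""
    best_score, best_rule = None, None
    for attr in attributes:
        for value in sorted({c[attr] for c in cases}):
            covered = [c for c in cases if c[attr] == value]
            label, hits = Counter(c[class_attr] for c in covered).most_common(1)[0]
            score = (hits / len(covered), hits)
            if best_score is None or score > best_score:
                best_score = score
                best_rule = {"terms": [(attr, value)], "label": label}
    return best_rule

def sequential_covering(cases, attributes, class_attr, max_uncovered=1):
    """Discover rules one at a time, removing the cases each new rule covers."""
    rules, remaining = [], list(cases)
    while len(remaining) > max_uncovered:
        rule = find_one_rule(remaining, attributes, class_attr)
        rules.append(rule)
        remaining = [c for c in remaining if not covers(rule, c)]
    return rules, remaining

# Toy categorical data (years already discretized in a preprocessing step).
cases = [
    {"rank": "Assistant Prof", "years": "<=6", "tenured": "no"},
    {"rank": "Assistant Prof", "years": ">6",  "tenured": "yes"},
    {"rank": "Professor",      "years": "<=6", "tenured": "yes"},
    {"rank": "Associate Prof", "years": ">6",  "tenured": "yes"},
    {"rank": "Assistant Prof", "years": "<=6", "tenured": "no"},
    {"rank": "Associate Prof", "years": "<=6", "tenured": "no"},
]
rules, remaining = sequential_covering(cases, ["rank", "years"], "tenured")
```

Each pass shrinks the training set, so later rules are discovered only on cases that earlier rules did not cover. That ordering is why the result is a rule list rather than an unordered rule set.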

Ant Miner has five major modules:

• General description
• Heuristic function
• Rule pruning
• Pheromone updating
• Use of discovered rules for classifying new cases
Search Space

[Figure: Search Space]

Algorithm

[Figure: Algorithm]

Term Selection Probability

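The probability that termij is chosen to be added to the current partial rule is

$$P_{ij} = \frac{\eta_{ij} \cdot \tau_{ij}(t)}{\sum_{i=1}^{a} x_i \cdot \sum_{j=1}^{b_i} \left(\eta_{ij} \cdot \tau_{ij}(t)\right)}$$

Where:
• Pij is the probability that termij is selected for the current partial rule
• ηij is the value of the heuristic function for termij
• τij(t) is the amount of pheromone associated with termij at iteration t
• a is the total number of attributes
• xi is set to 1 if the attribute Ai was not yet used by the current ant, and to 0 otherwise
• bi is the number of values in the domain of the ith attribute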
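
A minimal Python sketch of this selection probability is given below. It assumes the heuristic and pheromone values are stored as nested dictionaries keyed by attribute and value, and it handles the indicator xi by skipping attributes the ant has already used. The names and the numeric values are made up for illustration.

```python
def term_probabilities(eta, tau, used_attributes):
    """Return {(attribute, value): P_ij} over the terms of unused attributes.

    eta, tau: dict attribute -> dict value -> float
    used_attributes: attributes already present in the partial rule (x_i = 0)
    """
    weights = {
        (attr, value): eta[attr][value] * tau[attr][value]
        for attr in eta if attr not in used_attributes   # x_i = 1 only if unused
        for value in eta[attr]
    }
    total = sum(weights.values())
    return {term: w / total for term, w in weights.items()}

# Assumed toy values: two attributes, uniform pheromone.
eta = {"rank": {"Professor": 0.3, "Assistant Prof": 0.1},
       "years": {"<=6": 0.2, ">6": 0.4}}
tau = {"rank": {"Professor": 0.25, "Assistant Prof": 0.25},
       "years": {"<=6": 0.25, ">6": 0.25}}

# The ant has already used "years", so only "rank" terms remain candidates.
probs = term_probabilities(eta, tau, used_attributes={"years"})
```

Because the pheromone is uniform here, the probabilities depend only on the heuristic values: 0.75 for rank = Professor and 0.25 for rank = Assistant Prof.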

Heuristic Function

For each termij of the form Ai = Vij, where Ai is the ith attribute and Vij is the jth value belonging to the domain of Ai, its entropy is:

$$H(W \mid A_i = V_{ij}) = -\sum_{w=1}^{k} P(w \mid A_i = V_{ij}) \cdot \log_2 P(w \mid A_i = V_{ij})$$

Where
• W is the class attribute, and w ranges over its classes
• k is the number of classes in the domain of the class attribute
• Ai is the ith attribute
• Vij is the jth value of the domain of the attribute Ai
• P(w | Ai = Vij) is the empirical probability of observing class w conditional on having observed Ai = Vij
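
The entropy of a term can be computed directly from class counts, as in the Python sketch below. It assumes cases are dictionaries of categorical values and uses base-2 logarithms; the helper name and the toy data are illustrative.

```python
import math
from collections import Counter

def term_entropy(cases, attr, value, class_attr):
    """Entropy of the class distribution among cases with cases[attr] == value."""
    labels = [c[class_attr] for c in cases if c[attr] == value]
    n = len(labels)
    # Empirical P(w | attr = value) is count / n; classes with zero count
    # contribute nothing and never appear in the Counter.
    return -sum((m / n) * math.log2(m / n) for m in Counter(labels).values())

cases = [
    {"rank": "Assistant Prof", "tenured": "no"},
    {"rank": "Assistant Prof", "tenured": "yes"},
    {"rank": "Professor",      "tenured": "yes"},
    {"rank": "Assistant Prof", "tenured": "no"},
]

h_mixed = term_entropy(cases, "rank", "Assistant Prof", "tenured")  # classes split 2:1
h_pure = term_entropy(cases, "rank", "Professor", "tenured")        # single class
```

A term whose covered cases all share one class has entropy 0, which makes it the most informative kind of term. A term whose cases are spread evenly over the k classes reaches the maximum, log2 k.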

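The proposed normalized, information-theoretic heuristic function is

$$\eta_{ij} = \frac{\log_2 k - H(W \mid A_i = V_{ij})}{\sum_{i=1}^{a} x_i \cdot \sum_{j=1}^{b_i} \left(\log_2 k - H(W \mid A_i = V_{ij})\right)}$$

Where:
• a is the total number of attributes
• xi is set to 1 if the attribute Ai was not yet used by the current ant, and to 0 otherwise
• bi is the number of values in the domain of the ith attribute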
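
The normalization can be sketched in Python as follows, assuming the entropies have already been computed and are stored per attribute and value. Each term gets log2 k minus its entropy, and the results are divided by their sum over all terms of unused attributes. The entropy values below are assumed for illustration, and the sketch does not handle the degenerate case in which every term has maximal entropy.

```python
import math

def normalized_heuristic(entropy, k, used_attributes=()):
    """Return eta as dict attribute -> dict value -> float for unused attributes.

    entropy: dict attribute -> dict value -> H(W | attribute = value)
    k: number of classes
    """
    gain = {
        attr: {v: math.log2(k) - h for v, h in values.items()}
        for attr, values in entropy.items()
        if attr not in used_attributes                 # x_i = 1 only if unused
    }
    total = sum(g for values in gain.values() for g in values.values())
    return {attr: {v: g / total for v, g in values.items()}
            for attr, values in gain.items()}

# Assumed entropies for a two-class problem (k = 2).
entropy = {
    "rank": {"Professor": 0.0, "Assistant Prof": 0.9183, "Associate Prof": 1.0},
    "years": {"<=6": 0.8113, ">6": 0.0},
}
eta = normalized_heuristic(entropy, k=2)
eta_sum = sum(v for values in eta.values() for v in values.values())
```

Terms with zero entropy receive the largest heuristic value, a term with maximal entropy receives 0, and the values sum to 1 over the candidate terms.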

Rule Pruning

• The goal of rule pruning is to remove irrelevant terms from the rule

• Rule pruning potentially increases the predictive power of the rule, helping to avoid overfitting to the training data

• Rule pruning also improves the simplicity of the rule

• The basic idea is to iteratively remove one term at a time from the rule while this process improves the quality of the rule

• In each iteration, the term whose removal most improves the quality of the rule is removed

• This process is repeated until the rule has just one term or until there is no term whose removal will improve the quality of the rule
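
The pruning loop described above might look like the following Python sketch. The `quality` argument is any function that scores a list of terms; the `toy_quality` function used here is invented purely to exercise the loop and is not the rule quality measure of Ant-Miner.

```python
def prune_rule(terms, quality):
    """Greedily drop the term whose removal most improves quality."""
    terms = list(terms)
    best_q = quality(terms)
    while len(terms) > 1:
        # Try removing each term in turn and keep the best resulting rule.
        candidates = [terms[:i] + terms[i + 1:] for i in range(len(terms))]
        best_candidate = max(candidates, key=quality)
        q = quality(best_candidate)
        if q <= best_q:            # no removal improves the rule: stop
            break
        terms, best_q = best_candidate, q
    return terms

def toy_quality(terms):
    """Made-up score: terms 'a' and 'b' help, the term 'noise' hurts."""
    return 0.5 + 0.2 * ("a" in terms) + 0.2 * ("b" in terms) - 0.1 * ("noise" in terms)

pruned = prune_rule(["a", "noise", "b"], toy_quality)
```

Here the irrelevant term is removed in the first iteration, and pruning stops in the second because dropping either remaining term would lower the score.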

Pheromone Updating

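The initial amount of pheromone deposited at each path position is inversely proportional to the total number of values of all attributes and is defined by

$$\tau_{ij}(t = 0) = \frac{1}{\sum_{i=1}^{a} b_i}$$

where a is the total number of attributes and bi is the number of values in the domain of the ith attribute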

• The amount of pheromone associated with each termij occurring in the rule found by the ant (after pruning) is increased in proportion to the quality of that rule

• This corresponds to increasing the probability of termij being chosen by other ants in the future, in proportion to the quality of the rule

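Pheromone updating is performed for all terms termij that occur in the rule:

$$\tau_{ij}(t + 1) = \tau_{ij}(t) + \tau_{ij}(t) \cdot Q, \quad \forall\, term_{ij} \in R$$

where Q is the quality of the rule and R is the set of terms occurring in the rule constructed by the ant at iteration t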
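
The two pheromone steps, uniform initialization and reinforcement of the terms in the discovered rule, can be sketched in Python as below. The sketch assumes the pheromone table is a dictionary keyed by (attribute, value), that initialization gives every term the reciprocal of the total number of attribute values, and that reinforcement adds the current pheromone times the rule quality Q. It covers only the increase described above; the names and the quality value are illustrative.

```python
def initial_pheromone(domains):
    """Every term starts with 1 / (total number of values over all attributes)."""
    total_values = sum(len(values) for values in domains.values())
    return {(attr, v): 1.0 / total_values
            for attr, values in domains.items() for v in values}

def reinforce(tau, rule_terms, quality):
    """Increase pheromone on the terms of the rule in proportion to its quality."""
    return {term: (p + p * quality if term in rule_terms else p)
            for term, p in tau.items()}

domains = {"rank": ["Professor", "Associate Prof", "Assistant Prof"],
           "years": ["<=6", ">6"]}
tau0 = initial_pheromone(domains)                  # five terms, 0.2 each

# Assumed example: the pruned rule contains one term and has quality Q = 0.5.
tau1 = reinforce(tau0, {("years", ">6")}, quality=0.5)
```

After the update, the term years = >6 holds 0.3 while every other term stays at 0.2, so later ants are more likely to pick it.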

Rule Quality
• The quality of a rule is computed according to:

$$Q = \frac{TP}{TP + FN} \cdot \frac{TN}{FP + TN}$$

• TP is the number of cases covered by the rule that have the same class label as the rule
• FP is the number of cases covered by the rule that have a class label different from that of the rule
• FN is the number of cases that are not covered by the rule but have the same class label as the rule
• TN is the number of cases that are not covered by the rule and do not have the same class label as the rule
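
A short Python sketch of the quality measure follows, assuming Q is the product of sensitivity, TP / (TP + FN), and specificity, TN / (FP + TN). Returning 0 for an empty denominator is an added safeguard, not something stated in the slides. The counts come from applying the hypothetical rule IF years > 6 THEN tenured = yes to the six training cases shown earlier.

```python
def rule_quality(tp, fp, fn, tn):
    """Q = sensitivity * specificity, computed from the four counts."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0   # guard is an assumption
    specificity = tn / (fp + tn) if fp + tn else 0.0   # guard is an assumption
    return sensitivity * specificity

# Rule "IF years > 6 THEN tenured = yes" on the tenure table:
# covered: Mary, Jim (both 'yes')          -> TP = 2, FP = 0
# not covered: Bill ('yes')                -> FN = 1
# not covered: Mike, Dave, Anne ('no')     -> TN = 3
q = rule_quality(tp=2, fp=0, fn=1, tn=3)
```

The rule makes no false positive, so its specificity is 1, but it misses one tenured case, which lowers Q to 2/3.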
EXPERIMENTS
CONCLUSION

• Compared to CN2, Ant-Miner achieved better predictive accuracy on four data sets, while CN2 obtained better results on one data set

• On the other hand, Ant-Miner consistently found much smaller rule lists than CN2
