Lecture+32 33+Data+Mining+With+ACO

DATA MINING WITH ANT COLONY
OPTIMIZATION
ANT ALGORITHM
Another Collective Intelligence Approach
Ants appeared on earth some 100 million years ago
They have a current total population estimated at 10^16 individuals
Most of these ants are social insects living in colonies (population may vary from 30 to millions of individuals)
For several decades researchers have been fascinated by the emergent behavior of colonies of social insects, such as ants
Despite the relative simplicity of an individual's behavior, the colony as a whole can exhibit highly adaptive behavior, leading researchers to consider the adaptation of these collective processes for use in the area of artificial intelligence
Food Foraging Behaviour
Observations of the foraging behavior of ants have inspired the development of ant-based algorithms
These algorithms are used mainly to solve combinatorial optimization problems defined over discrete search spaces
One of the first behaviors studied by ethologists was the ability of ants to find the shortest path between their nest and a food source. These studies resulted in the first algorithmic models of the foraging behavior of ants (Dorigo)
How do ants find the shortest path between their nest and a food source, without any visible, central, active coordination mechanism?
Studies have shown that the search for food is random in the beginning
As soon as a food source is located, activity becomes more organized, with more and more ants following the same (shortest) path to the food source
Ant Algorithm
An artificial ant can be considered as a simple computational agent. Logic is implemented in the artificial ant to select a path whenever several paths can be followed
A number of artificial "ants" construct solutions to the problem at hand by the repeated selection of parts from a predefined set of solution components
These ants select components probabilistically, biased by heuristic information (a problem-specific heuristic measure of a component's utility) and pheromone information, typically associated with the solution components
To simulate the real-world process by which ants find the shortest path to a food source, artificial ants deposit pheromone on components in proportion to the quality of the solutions that contain them
This is similar to a shorter path receiving pheromone reinforcement sooner than a longer path (as ants return from the food source sooner), thereby increasing the likelihood that later ants will choose the shorter path over the longer one
ACO has been applied to many problems and is well suited to discrete optimization problems
• Job scheduling
• Subset problems
• Network routing
• Vehicle routing
• Bioinformatics
• Data mining
To apply ACO to a problem we need
• A representation of the solution
• A method to determine the fitness of a solution
• A heuristic measure for the solution's components
Consider the general problem of finding the shortest path between two nodes on a graph, G = (V, E), where V is the set of vertices (nodes) and E is a matrix representing the connections between nodes
Ant Algorithm and the TS problem
The Traveling Salesperson problem (TSP) is to find, for n given cities, a shortest closed tour that contains every city exactly once
In each generation each of m ants constructs one solution
An ant starts from a random city and iteratively moves to another city until the tour is complete and the ant is back at its starting point
When an ant decides which city to move to next, it does so with a probability that is based on the distance to that city and the amount of pheromone on the connecting edge
Let dij be the distance between the cities i and j
The probability that the ant chooses j as the next city after it has arrived at city i, where j is in the set S of cities that have not been visited, is
$$p_{ij} = \frac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}{\sum_{k \in S} \tau_{ik}^{\alpha}\,\eta_{ik}^{\beta}}$$
Here τij is the amount of pheromone on the edge (i, j),
ηij = 1/dij is a heuristic value,
and α and β are constants that determine the relative influences of the pheromone value and of the heuristic value on the decision of the ant
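The city-selection rule above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function names, the nested-list layout of `tau` and `dist`, and the default values of `alpha` and `beta` are assumptions made for the example.

```python
import random

def transition_probabilities(i, unvisited, tau, dist, alpha=1.0, beta=2.0):
    """Probability of moving from city i to each city j in the unvisited set S.

    tau[i][j] is the pheromone on edge (i, j); the heuristic value is 1 / dist[i][j].
    alpha and beta are illustrative defaults, not values given in the lecture.
    """
    weights = {
        j: (tau[i][j] ** alpha) * ((1.0 / dist[i][j]) ** beta)
        for j in unvisited
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def construct_tour(n, tau, dist, rng, alpha=1.0, beta=2.0):
    """One ant builds a tour: random start city, then repeated probabilistic moves."""
    start = rng.randrange(n)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        probs = transition_probabilities(tour[-1], unvisited, tau, dist, alpha, beta)
        cities = list(probs)
        nxt = rng.choices(cities, weights=[probs[c] for c in cities], k=1)[0]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour  # the closing edge back to tour[0] is implied

if __name__ == "__main__":
    dist = [[0, 1, 2, 4], [1, 0, 3, 2], [2, 3, 0, 1], [4, 2, 1, 0]]
    tau = [[1.0] * 4 for _ in range(4)]
    print(construct_tour(4, tau, dist, random.Random(0)))
```

With uniform pheromone, the probabilities depend only on distance, so nearer cities are favored; as pheromone accumulates on some edges, those edges gain weight through the `tau[i][j] ** alpha` factor.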
The solution might have been even better if a city had been replaced by another city
But now, due to its high pheromone value, it will continue to have a strong probability of selection for a considerable length of time (until pheromone values on competing cities become much higher)
A high level of pheromone will be associated only with those links that are continuously selected in good solutions
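One common way to obtain this behavior is to let pheromone evaporate before each deposit, so that only links reinforced again and again keep a high value. The sketch below follows that idea for the TSP. The evaporation rate `rho`, the deposit constant `q`, and the function name are assumptions for illustration; the lecture text does not specify them.

```python
def update_pheromone(tau, tours, lengths, rho=0.5, q=1.0):
    """Evaporate all pheromone, then deposit on the edges of each tour.

    rho (evaporation rate) and q (deposit constant) are assumed parameters.
    Each tour deposits q / length on its edges, so shorter tours deposit more.
    """
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)
    for tour, length in zip(tours, lengths):
        deposit = q / length
        # Pair each city with its successor, wrapping around to close the tour.
        for a, b in zip(tour, tour[1:] + tour[:1]):
            tau[a][b] += deposit
            tau[b][a] += deposit  # symmetric TSP: edge (a, b) equals edge (b, a)
    return tau
```

An edge that stops appearing in good tours loses a fraction `rho` of its pheromone every generation, which is what eventually lets competing edges overtake it.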
DATA MINING WITH AN ANT COLONY
OPTIMIZATION ALGORITHM
• The goal of data mining is to discover knowledge that is not only accurate, but also comprehensible for the user
• Comprehensibility is important whenever discovered knowledge will be used for supporting a decision made by a human user
• If discovered knowledge is not comprehensible, it can lead to incorrect decisions
CLASSIFICATION PROCESS: MODEL CONSTRUCTION
Training Data -> Classification Algorithms -> Classifier (Model)

Training Data:
NAME   RANK             YEARS   TENURED
Mike   Assistant Prof   3       no
Mary   Assistant Prof   7       yes
Bill   Professor        2       yes
Jim    Associate Prof   7       yes
Dave   Assistant Prof   6       no
Anne   Associate Prof   3       no

Classifier (Model):
IF rank = 'professor' OR years > 6
THEN tenured = 'yes'
• The Ant-Miner (ant colony based data miner) algorithm was proposed by Rafael S. Parpinelli
• The goal is to extract classification rules by using the Ant Colony Optimization algorithm
• The performance of Ant-Miner is compared with CN2, a well-known data mining algorithm for classification, on six public domain data sets
• In classification, discovered knowledge is often expressed as
IF <conditions> THEN <class>
• Each condition is referred to as a term, so that the rule antecedent is a logical conjunction of terms, such as
IF term1 AND term2 AND ...
• Each term is a triple <Attribute, Operator, Value>, such as <Gender = male>
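The rule format above maps naturally onto a small data structure. The following Python sketch is one possible representation; the names `Term`, `rule_covers`, and `format_rule` are hypothetical and chosen only for this example.

```python
from typing import NamedTuple

class Term(NamedTuple):
    """One condition of a rule: <Attribute, Operator, Value>."""
    attribute: str
    operator: str
    value: str

def term_holds(term, case):
    """Check a single term against a case (a dict of attribute -> value)."""
    if term.operator != "=":
        raise ValueError("only the '=' operator is handled in this sketch")
    return case.get(term.attribute) == term.value

def rule_covers(antecedent, case):
    """The antecedent is a conjunction: every term must hold."""
    return all(term_holds(t, case) for t in antecedent)

def format_rule(antecedent, predicted_class):
    conditions = " AND ".join(
        f"{t.attribute} {t.operator} {t.value}" for t in antecedent
    )
    return f"IF {conditions} THEN {predicted_class}"

if __name__ == "__main__":
    rule = [Term("Gender", "=", "male")]
    print(format_rule(rule, "yes"))
    print(rule_covers(rule, {"Gender": "male", "Age": "young"}))
```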
ANT-MINER
• The current version of Ant-Miner copes only with categorical attributes, so that the operator element is always "="
• Continuous (real-valued) attributes are discretized in a preprocessing step
• It follows a sequential covering approach to discover a list of classification rules covering all, or almost all, the training cases
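The sequential covering loop can be sketched as follows. To keep the example self-contained, `discover_one_rule` is a deliberately simple stand-in that picks the purest single-term rule; in Ant-Miner this step would be the ant colony search. The `max_uncovered` threshold and all function names are assumptions for illustration.

```python
from collections import Counter

def covers(antecedent, case):
    """antecedent is a list of (attribute, value) pairs joined by AND."""
    return all(case.get(attr) == val for attr, val in antecedent)

def discover_one_rule(cases, class_attr):
    """Stand-in for the ant colony search: best single-term rule by purity,
    with ties broken by the number of covered cases."""
    best = None
    attrs = sorted(a for a in cases[0] if a != class_attr)
    for attr in attrs:
        for val in sorted({c[attr] for c in cases}):
            covered = [c for c in cases if c[attr] == val]
            label, hits = Counter(c[class_attr] for c in covered).most_common(1)[0]
            score = (hits / len(covered), len(covered))
            if best is None or score > best[0]:
                best = (score, [(attr, val)], label)
    return best[1], best[2]

def sequential_covering(cases, class_attr, max_uncovered=1):
    """Discover rules one at a time, removing the cases each rule covers."""
    remaining = list(cases)
    rules = []
    while len(remaining) > max_uncovered:
        antecedent, label = discover_one_rule(remaining, class_attr)
        rules.append((antecedent, label))
        remaining = [c for c in remaining if not covers(antecedent, c)]
    # Default class for cases that no rule covers.
    leftover = remaining or cases
    default = Counter(c[class_attr] for c in leftover).most_common(1)[0][0]
    return rules, default
```

The result is an ordered rule list plus a default class, which matches the idea of covering all, or almost all, training cases before stopping.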
Ant Miner has five major modules:
• General description
• Heuristic function
• Rule pruning
• Pheromone updating
• Use of discovered rules for classifying new cases
Search Space
Algorithm
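Term Selection Probability
$$P_{ij} = \frac{\eta_{ij}\,\tau_{ij}(t)}{\sum_{i=1}^{a} x_i \sum_{j=1}^{b_i} \eta_{ij}\,\tau_{ij}(t)}$$
Where:
• Pij is the probability that termij is selected for addition to the current partial rule
• ηij is the value of the heuristic function for termij
• τij(t) is the amount of pheromone associated with termij at iteration t
• a is the total number of attributes
• xi is set to 1 if the attribute Ai was not yet used by the current ant, and to 0 otherwise
• bi is the number of values in the domain of the ith attribute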
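As an illustration, the term selection step might be coded as below. The dictionaries keyed by `(attribute, value)` pairs and the function names are assumptions made for this sketch; terms whose attribute is already in the partial rule are skipped, and the rest are normalized so the probabilities sum to 1.

```python
import random

def term_probabilities(eta, tau, used_attrs):
    """eta and tau map (attribute, value) -> float.

    Terms whose attribute is already used by the current ant are excluded.
    """
    weights = {
        term: eta[term] * tau[term]
        for term in eta
        if term[0] not in used_attrs
    }
    total = sum(weights.values())
    return {term: w / total for term, w in weights.items()}

def select_term(eta, tau, used_attrs, rng=random):
    """Draw one term at random according to the probabilities above."""
    probs = term_probabilities(eta, tau, used_attrs)
    terms = list(probs)
    return rng.choices(terms, weights=[probs[t] for t in terms], k=1)[0]
```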
Heuristic Function
For each termij of the form Ai = Vij, where Ai is the ith attribute and Vij is the jth value belonging to the domain of Ai, its entropy is:
$$H(W \mid A_i = V_{ij}) = -\sum_{w=1}^{k} P(w \mid A_i = V_{ij}) \, \log_2 P(w \mid A_i = V_{ij})$$
Where
• W is the class attribute
• k is the number of classes in the domain of the class attribute
• Ai is the ith attribute
• Vij is the jth value of the domain of the attribute Ai
• P(w | Ai = Vij) is the empirical probability of observing class w conditional on having observed Ai = Vij
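A short Python sketch of this entropy computation follows. Cases are assumed to be dictionaries of categorical values, and the function name `term_entropy` is hypothetical. The example data reuses the tenure table from the classification slide.

```python
import math
from collections import Counter

def term_entropy(cases, attr, value, class_attr):
    """Entropy of the class distribution among cases where attr == value."""
    labels = [c[class_attr] for c in cases if c[attr] == value]
    if not labels:
        raise ValueError("no training case has this attribute value")
    n = len(labels)
    counts = Counter(labels)
    # Empirical P(w | attr = value) is count / n for each class w that occurs.
    return -sum((m / n) * math.log2(m / n) for m in counts.values())

if __name__ == "__main__":
    cases = [
        {"rank": "Assistant Prof", "tenured": "no"},
        {"rank": "Assistant Prof", "tenured": "yes"},
        {"rank": "Professor", "tenured": "yes"},
        {"rank": "Associate Prof", "tenured": "yes"},
        {"rank": "Assistant Prof", "tenured": "no"},
        {"rank": "Associate Prof", "tenured": "no"},
    ]
    for rank in ("Professor", "Assistant Prof", "Associate Prof"):
        print(rank, round(term_entropy(cases, "rank", rank, "tenured"), 4))
```

A term with entropy 0 selects cases of a single class, while a term with entropy log2(k) gives no information about the class.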
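Heuristic Function
The proposed normalized information-theoretic heuristic function is
$$\eta_{ij} = \frac{\log_2 k - H(W \mid A_i = V_{ij})}{\sum_{i=1}^{a} x_i \sum_{j=1}^{b_i} \left( \log_2 k - H(W \mid A_i = V_{ij}) \right)}$$
Where:
• k is the number of classes and H(W | Ai = Vij) is the entropy of termij
• a is the total number of attributes
• xi is set to 1 if the attribute Ai was not yet used by the current ant, and to 0 otherwise
• bi is the number of values in the domain of the ith attribute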
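Given precomputed entropies, the normalization can be sketched in a few lines of Python. The dictionary layout and the name `normalized_heuristic` are assumptions for illustration.

```python
import math

def normalized_heuristic(entropies, k, used_attrs=()):
    """entropies maps (attribute, value) -> H(W | attribute = value).

    Terms of attributes already used by the current ant are left out.
    Assumes at least one remaining term has entropy below log2(k);
    otherwise the normalizing sum would be zero.
    """
    info = {
        term: math.log2(k) - h
        for term, h in entropies.items()
        if term[0] not in used_attrs
    }
    total = sum(info.values())
    return {term: v / total for term, v in info.items()}
```

Lower entropy gives a larger heuristic value, so terms that separate the classes well are more likely to be selected.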
Rule Pruning
• The goal of rule pruning is to remove irrelevant terms from the rule
• Rule pruning potentially increases the predictive power of the rule, helping to avoid over-fitting
• Rule pruning also improves the simplicity of the rule
• The basic idea is to iteratively remove one term at a time from the rule for as long as this improves the quality of the rule
• In each iteration, the term whose removal most improves the quality of the rule is removed
• This process is repeated until the rule has just one term or until there is no term whose removal will improve the quality of the rule
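The pruning loop can be sketched as below. The rule quality measure is passed in as a function so the sketch stays independent of how quality is computed; the name `prune_rule` and the list-of-terms representation are assumptions.

```python
def prune_rule(antecedent, quality):
    """Greedily drop terms while doing so strictly improves rule quality.

    antecedent: list of terms; quality: function mapping a list of terms to a float.
    """
    rule = list(antecedent)
    best_q = quality(rule)
    while len(rule) > 1:
        # Try removing each term in turn and keep the best resulting rule.
        candidates = [rule[:i] + rule[i + 1:] for i in range(len(rule))]
        q, shorter = max(((quality(c), c) for c in candidates), key=lambda p: p[0])
        if q <= best_q:
            break  # no removal improves the quality
        rule, best_q = shorter, q
    return rule
```

In this sketch a removal that leaves quality unchanged is not applied, matching the stated stopping condition that pruning ends when no removal improves the rule.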
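Pheromone Updating
The initial amount of pheromone deposited at each path position is inversely proportional to the number of values of all attributes and is defined by
$$\tau_{ij}(t = 0) = \frac{1}{\sum_{i=1}^{a} b_i}$$
where a is the total number of attributes and bi is the number of values in the domain of the ith attribute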
• The amount of pheromone associated with each termij occurring in the rule found by the ant (after pruning) is increased in proportion to the quality of that rule
• This corresponds to increasing the probability of termij being chosen by other ants in the future in proportion to the quality of the rule
• Pheromone updating is performed for all terms termij that occur in the rule:
$$\tau_{ij}(t+1) = \tau_{ij}(t) + \tau_{ij}(t) \cdot Q, \quad \forall\, term_{ij} \in R$$
• R is the set of terms occurring in the rule constructed by the ant at iteration t, and Q is the quality of that rule
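A minimal Python sketch of pheromone initialization and reinforcement is given below. It covers only the two steps described here (equal initial pheromone, then an increase in proportion to rule quality); the function names and the dictionary keyed by `(attribute, value)` pairs are assumptions.

```python
def initial_pheromone(domains):
    """domains maps attribute -> list of its values.

    Every term starts with 1 / (total number of values over all attributes).
    """
    total_values = sum(len(values) for values in domains.values())
    return {
        (attr, value): 1.0 / total_values
        for attr, values in domains.items()
        for value in values
    }

def reinforce(tau, rule_terms, quality):
    """Increase pheromone of each term in the rule in proportion to rule quality."""
    for term in rule_terms:
        tau[term] += tau[term] * quality
    return tau

if __name__ == "__main__":
    domains = {
        "rank": ["Assistant Prof", "Associate Prof", "Professor"],
        "years": ["<=6", ">6"],
    }
    tau = initial_pheromone(domains)
    reinforce(tau, [("years", ">6")], quality=0.5)
    print(tau)
```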
Rule Quality
• The quality of a rule is computed according to:
$$Q = \frac{TP}{TP + FN} \cdot \frac{TN}{FP + TN}$$
• TP is the number of cases covered by the rule that have the same class label as the rule
• FP is the number of cases covered by the rule that have a class label different from that of the rule
• FN is the number of cases that are not covered by the rule but have the same class label as the rule
• TN is the number of cases that are not covered by the rule and do not have the same class label as the rule
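The four counts and the resulting quality can be computed as in the following sketch. The function name, the `(attribute, value)` representation of terms, and the choice to return 0.0 when a denominator is zero are assumptions for this example.

```python
def rule_quality(antecedent, predicted_class, cases, class_attr):
    """Quality as TP/(TP+FN) times TN/(FP+TN), from the four counts defined above.

    antecedent: list of (attribute, value) pairs; cases: list of dicts.
    """
    tp = fp = fn = tn = 0
    for case in cases:
        covered = all(case.get(attr) == val for attr, val in antecedent)
        same_class = case[class_attr] == predicted_class
        if covered and same_class:
            tp += 1
        elif covered:
            fp += 1
        elif same_class:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or fp + tn == 0:
        return 0.0  # assumed convention for a degenerate denominator
    return (tp / (tp + fn)) * (tn / (fp + tn))
```

On the tenure table from the classification slide, the rule IF rank = Professor THEN yes covers only Bill, giving TP = 1, FP = 0, FN = 2, TN = 3 and a quality of 1/3.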
EXPERIMENTS
CONCLUSION
• Ant-Miner achieved better predictive accuracy than CN2 in four data sets, while CN2 obtained better results in one data set
• On the other hand, Ant-Miner consistently found much smaller rule lists than CN2
