
Machine Learning

By: Vatsal J. Gajera
(09BCE010)
 What is Machine Learning?
It is a branch of artificial intelligence. It is
a scientific discipline concerned with the
design and development of algorithms
that allow computers to evolve behaviours
based on empirical data, such as data
from sensors and databases.
 Technical Definition of
machine learning:

According to Tom M. Mitchell, a computer
program is said to learn from experience E
with respect to some class of tasks T and
performance measure P, if its performance
at tasks in T, as measured by P, improves
with experience E.
A major focus of machine learning
research is to automatically learn to
recognize complex patterns and make
intelligent decisions based on data; the
difficulty lies in the fact that the set of all
possible behaviors given all possible
inputs is too large to be covered by the
set of observed examples (training data).
Hence the learner must generalize from
the given examples, so as to be able to
produce a useful output in new cases.
 Some machine learning systems attempt
to eliminate the need for human
interaction in data analysis, while others
adopt a collaborative approach between
human and machine. Human intuition
cannot, however, be entirely eliminated,
since the system's designer must specify
how the data is to be represented and
what mechanisms will be used to search
for a characterization of the data.
 Applications of machine
learning:
 Search Engines.
 Medical Diagnosis.
 Stock Market Analysis.
 Game Playing.
 Software Engineering.
 Robot locomotion (movement from one place to
another).
 Etc.
 There are several algorithms for
machine learning.

1. Decision Tree Algorithm.


2. Bayesian Classification Algorithm.
3. Shortest Path Calculation Algorithm.
4. Neural Network Algorithm.
5. Genetic Algorithm.
1. Decision Tree Algorithm:
 It is used in statistics, data mining, and
machine learning. It uses a decision tree as
a predictive model, which maps
observations about an item to conclusions
about the item's target value.
 The goal is to create a model that
predicts the value of a target variable based
on several input variables.
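
As a minimal sketch of a decision tree used as a predictive model (assuming scikit-learn is available; the integer encoding of the categorical attributes is our own illustration, using the Buy_Computer example table shown later in these slides):

from sklearn.tree import DecisionTreeClassifier

# Each row encodes (age, income, student, credit_rating) as integers:
# age: young=0, middle=1, senior=2; income: low=0, medium=1, high=2;
# student: no=0, yes=1; credit_rating: fair=0, excellent=1.
X = [[0, 2, 0, 0], [0, 2, 0, 1], [1, 2, 0, 0], [2, 1, 1, 0],
     [2, 0, 1, 1], [1, 1, 0, 0], [2, 1, 0, 1]]
y = [0, 0, 1, 1, 0, 1, 0]  # Buy_Computer: no=0, yes=1

tree = DecisionTreeClassifier(criterion="entropy").fit(X, y)
print(tree.predict([[1, 1, 1, 0]]))  # predict for a new (middle, medium, yes, fair) tuple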
2. Bayesian Classification:

 Bayesian classifiers are statistical
classifiers. They can predict class
membership probabilities, such as the
probability that a given tuple belongs to
a particular class.
 This classification is based on Bayes'
theorem.
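
The underlying rule is Bayes' theorem: for a data tuple X and a class Ci,

P(Ci | X) = P(X | Ci) × P(Ci) / P(X)

where P(Ci | X) is the posterior probability that X belongs to class Ci; the classifier predicts the class with the highest posterior.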
3. Neural Network Algorithm:
 An artificial neural network is a
mathematical or computational model
that is inspired by the structural and
functional aspects of biological neural
networks. A neural network consists of an
interconnected group of artificial neurons,
and it processes information using a
connectionist approach to computation.
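
As a minimal sketch of one such artificial neuron (the weights, bias, and inputs below are arbitrary illustrative values):

import math

def neuron(inputs, weights, bias):
    # weighted sum of inputs followed by a sigmoid activation
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# one unit of an interconnected group; a network wires many of these together
print(neuron([1.0, 0.5], [0.8, -0.4], 0.1))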
1. Decision Tree Induction:
 During the late 1970s, J. Ross Quinlan, a researcher in
machine learning, developed a decision tree algorithm
known as ID3 (Iterative Dichotomiser). Quinlan later
presented C4.5, which became a benchmark to which newer
supervised learning algorithms are often compared. In 1984
a group of statisticians published the book Classification and
Regression Trees (CART), which describes the generation of
binary decision trees.
 ID3, C4.5, and CART adopt a greedy (i.e., non-backtracking)
approach in which decision trees are constructed in a top-down,
recursive, divide-and-conquer manner.
 Inputs:
 Data partition D, which is a set of training tuples and their
associated class labels.
 Attribute_list, the set of candidate attributes.
 Attribute_selection_method, a procedure to determine the
splitting criterion that "best" partitions the data tuples into
individual classes. This criterion consists of a splitting_attribute
and, possibly, either a split point or a splitting subset.
 Output: A decision tree.
 Method:
1. Create a node N;
2. If the tuples in D are all of the same class, C, then return N as a leaf
node labeled with the class C;
3. If attribute_list is empty, then return N as a leaf node labeled
with the majority class in D;
4. Apply Attribute_selection_method(D, attribute_list) to find the "best"
splitting_criterion;
5. Label node N with splitting_criterion;
6. If splitting_attribute is discrete-valued and multiway splits are
allowed, then attribute_list = attribute_list − splitting_attribute;
7. For each outcome j of splitting_criterion:
8. Let Dj be the set of data tuples in D satisfying outcome j;
9. If Dj is empty, then
10. attach a leaf labeled with the majority class in D to node N;
11. else attach the node returned by Generate_decision_tree(Dj,
attribute_list) to node N; endfor
12. Return N;
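
The method above can be sketched in Python roughly as follows (a minimal illustration under our own data representation, not the book's code; the attribute-selection procedure is passed in as a plain function):

from collections import Counter

def generate_decision_tree(D, attribute_list, select):
    # D is a list of (attributes_dict, class_label) training tuples.
    labels = [label for _, label in D]
    if len(set(labels)) == 1:                # step 2: all one class -> leaf
        return {"leaf": labels[0]}
    majority = Counter(labels).most_common(1)[0][0]
    if not attribute_list:                   # step 3: no attributes left -> majority leaf
        return {"leaf": majority}
    attr = select(D, attribute_list)         # steps 4-5: choose "best" attribute
    remaining = [a for a in attribute_list if a != attr]  # step 6 (discrete, multiway)
    node = {"split_on": attr, "branches": {}}
    # steps 7-11: one branch per outcome; the empty-partition case (steps 9-10)
    # cannot occur here because we branch only on values present in D
    for value in {atts[attr] for atts, _ in D}:
        Dj = [(atts, label) for atts, label in D if atts[attr] == value]
        node["branches"][value] = generate_decision_tree(Dj, remaining, select)
    return node                              # step 12

# usage with a trivial selection procedure (always take the first candidate):
D = [({"age": "young", "student": "no"}, "no"),
     ({"age": "middle", "student": "no"}, "yes"),
     ({"age": "senior", "student": "yes"}, "yes")]
print(generate_decision_tree(D, ["age", "student"], lambda D, attrs: attrs[0]))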
 Attribute Selection Measures:
 An attribute selection measure is a heuristic
for selecting the splitting
criterion that "best" separates a given data
partition D, of class-labeled training tuples, into
individual classes. If we were to split D into
smaller partitions according to the outcomes of
the splitting criterion, ideally each partition would
be pure (i.e., all of the tuples that fall into a given
partition would belong to the same class).

 There are three main measures for it.


1. Information Gain.
2. Gain Ratio.
3. Gini Index.
Example:

Age    | Income | Student | Credit_Rating | Class: Buy_Computer
Young  | high   | no      | fair          | no
Young  | high   | no      | excellent     | no
Middle | high   | no      | fair          | yes
Senior | medium | yes     | fair          | yes
Senior | low    | yes     | excellent     | no
Middle | medium | no      | fair          | yes
Senior | medium | no      | excellent     | no
1. Information Gain:
 ID3 uses information gain as its attribute
selection measure. The measure is based on
pioneering work by Claude Shannon on
information theory, which studied the value or
information content of messages.

Info(D) = -∑ pi log2(pi)        (where i = 1 to m)

Info_A(D) = ∑ (|Dj| / |D|) × Info(Dj)        (where j = 1 to v)

Gain(A) = Info(D) − Info_A(D)
 In the example, the class Buy_Computer has two distinct values {yes, no}, so m = 2.
 Let class C1 correspond to yes and class C2 correspond to no.
 Here the tuples with "yes" number 3 and those with "no" number 4. Total = 3 + 4 = 7,

so Info(D) = -(3/7) log2(3/7) − (4/7) log2(4/7)
           = 0.9852
 Here young = 2, middle = 2, senior = 3. Among the young tuples both are from the "no"
class, among the middle tuples both are from the "yes" class, and among the senior tuples
1 is from the "yes" class and 2 are from the "no" class (taking 0 log 0 = 0 below),

so Info_age(D) = (2/7) × (-(2/2) log2(2/2) − (0/2) log2(0/2))
              + (2/7) × (-(2/2) log2(2/2) − (0/2) log2(0/2))
              + (3/7) × (-(1/3) log2(1/3) − (2/3) log2(2/3))
              = 0.3936

 So Gain(age) = 0.9852 − 0.3936
              = 0.5917
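
These numbers can be checked with a few lines of Python (a small verification sketch, not part of the original slides):

import math

def entropy(counts):
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

info_D = entropy([3, 4])                    # 3 "yes", 4 "no" tuples
info_age = (2/7) * entropy([0, 2]) \
         + (2/7) * entropy([2, 0]) \
         + (3/7) * entropy([1, 2])          # young, middle, senior partitions
print(round(info_D, 4), round(info_age, 4), round(info_D - info_age, 4))
# prints: 0.9852 0.3936 0.5917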
 As we calculated the gain for age, we have to calculate the gain for every
attribute. After calculating the gains, the attribute with the highest gain value
becomes our split node.
[Figure: the decision tree after the first split. The root node tests Age and branches on young, middle, and senior; each branch holds its partition's remaining attributes (Income, Student, Credit_rating) and Class: Buy_Computer labels, to be split further.]
2. Gain Ratio:
 The information gain measure is biased toward tests with many
outcomes. That is, it prefers to select attributes having a large number of
values. For example, consider an attribute that acts as a unique identifier, such
as product_ID. A split on it would give a large number of partitions, each
holding a single tuple, so Info_product_ID(D) = 0 and the information gain is
maximal, yet such a partitioning is useless for classification.

SplitInfo_A(D) = -∑ (|Dj| / |D|) × log2(|Dj| / |D|)        (where j = 1 to v)

GainRatio(A) = Gain(A) / SplitInfo_A(D)

For our example there are 2 tuples for young, 2 for middle, and 3 for senior,
so SplitInfo_age(D) = -(2/7) log2(2/7) − (2/7) log2(2/7) − (3/7) log2(3/7)
                    = 1.5567

For age, Gain(age) = 0.5917,

so GainRatio(age) = 0.5917 / 1.5567 = 0.3801.

The attribute that has the maximum gain ratio is selected as the split node.
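
Again as a quick check (a self-contained sketch; the SplitInfo of a partition is just the entropy of its relative sizes):

import math

def entropy(counts):
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

split_info_age = entropy([2, 2, 3])        # partition sizes: young, middle, senior
print(round(split_info_age, 4))            # 1.5567
print(round(0.5917 / split_info_age, 4))   # gain ratio for age ≈ 0.3801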
3. Gini Index:
 The Gini index is used in CART. The Gini index measures the
impurity of D:

Gini(D) = 1 − ∑ pi²        (where i = 1 to m)

 The Gini index considers a binary split for each attribute.
 If an attribute has v possible values, then there are 2^v possible
subsets.
 For example, for income we have
{low, medium, high}, {low, medium}, {low, high}, {medium, high},
{low}, {medium}, {high}, and {}. Excluding the full set and the empty
set, which do not represent a split, we have to consider only
2^v − 2 subsets.
Gini_A(D) = (|D1| / |D|) Gini(D1) + (|D2| / |D|) Gini(D2)

and ΔGini(A) = Gini(D) − Gini_A(D)

 The attribute that gives the minimum Gini index is selected
as the split node, because it has the lowest impurity.
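
As a small worked sketch using the example table (the binary grouping {young, middle} versus {senior} is our own illustrative choice of split):

def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

gini_D = gini([3, 4])                                     # 24/49 ≈ 0.4898
# binary split on age: D1 = {young, middle} (2 yes / 2 no), D2 = {senior} (1 yes / 2 no)
gini_split = (4/7) * gini([2, 2]) + (3/7) * gini([1, 2])  # ≈ 0.4762
print(round(gini_D - gini_split, 4))                      # reduction in impurity ≈ 0.0136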
 After calculating the selection measure, we split the
decision tree at the split node chosen through any of the
selection measures.
 The process continues until each partition contains tuples
of a single class.
 So the decision tree algorithm is implemented as above.
Reference:
 Data Mining: Concepts and Techniques, by Jiawei Han and
Micheline Kamber
 Machine Learning, by Tom M. Mitchell
 www.wikipedia.org
