
Machine learning

Machine learning (ML) is a category of algorithm that allows software applications to become more accurate in predicting outcomes without being explicitly programmed. The basic premise of machine learning is to build algorithms that can receive input data and use statistical analysis to predict an output while updating outputs as new data becomes available.
The processes involved in machine learning are similar to those of data mining and predictive modeling. Both require searching through data to look for patterns and adjusting program actions accordingly. Many people are familiar with machine learning from shopping on the internet and being served ads related to their purchases. This happens because recommendation engines use machine learning to personalize online ad delivery in near real time. Beyond personalized marketing, other common machine learning use cases include fraud detection, spam filtering, network security threat detection, predictive maintenance and building news feeds.

Types of machine learning algorithms

Just as there are nearly limitless uses of machine learning, there is no shortage of machine learning algorithms. They range from the fairly simple to the highly complex. Here are a few of the most commonly used models:

 Regression. This class of machine learning algorithm involves identifying a correlation -- generally between two variables -- and using that correlation to make predictions about future data points.
 Decision trees. These models use observations about certain
actions and identify an optimal path for arriving at a desired outcome.
 K-means clustering. This model groups a specified number of
data points into a specific number of groupings based on like
characteristics.
 Neural networks. These deep learning models utilize large
amounts of training data to identify correlations between many
variables to learn to process incoming data in the future.
How machine learning works
Machine learning algorithms are often categorized
as supervised or unsupervised. Supervised algorithms require a data
scientist or data analyst with machine learning skills to provide both input
and desired output, in addition to furnishing feedback about the accuracy
of predictions during algorithm training. Data scientists determine which
variables, or features, the model should analyze and use to develop
predictions. Once training is complete, the algorithm will apply what was
learned to new data.
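As a concrete sketch of this supervised setup (an analyst supplies both the features and the desired output labels, and the trained model is then applied to new data), here is a minimal nearest-neighbour classifier in plain Python; the data points and the "spam"/"ham" labels are invented for illustration:

```python
# Minimal supervised-learning sketch: a 1-nearest-neighbour classifier.
# Both the features (train_X) and the desired outputs (train_y) are
# provided up front, matching the supervised setup described above.

def predict(train_X, train_y, x):
    """Label x with the label of its closest training point."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = min(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    return train_y[nearest]

# Two features per point; the labels are illustrative only.
X = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
y = ["ham", "ham", "spam", "spam"]

print(predict(X, y, (0.85, 0.95)))  # a new point near the 'spam' group
```

Once "training" (here, simply storing the labeled examples) is complete, the model applies what it has seen to any new data point, exactly as the paragraph above describes.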

Unsupervised algorithms do not need to be trained with desired outcome data. Instead, they use an iterative approach called deep learning to review data and arrive at conclusions. Unsupervised learning algorithms, often implemented as neural networks, are used for more complex processing tasks than supervised learning systems, including image recognition, speech-to-text and natural language generation. These neural networks work by combing through millions of examples of training data and automatically identifying often subtle correlations between many variables. Once trained, the algorithm can use its bank of associations to interpret new data. These algorithms have only become feasible in the age of big data, as they require massive amounts of training data.

Probability
Probability is a branch of mathematics that deals with calculating the
likelihood of a given event's occurrence, which is expressed as a
number between 0 and 1. An event with a probability of 1 can be
considered a certainty: for example, the probability of a coin toss
resulting in either "heads" or "tails" is 1, because there are no other
options, assuming the coin lands flat. An event with a probability of .5
can be considered to have equal odds of occurring or not occurring:
for example, the probability of a coin toss resulting in "heads" is .5,
because the toss is equally as likely to result in "tails." An event with a
probability of 0 can be considered an impossibility: for example, the
probability that the coin will land (flat) without either side facing up is 0,
because either "heads" or "tails" must be facing up. A little paradoxical,
probability theory applies precise calculations to quantify uncertain
measures of random events.

In its simplest form, probability can be expressed mathematically as the number of occurrences of a targeted event divided by the total number of possible outcomes, i.e. the number of occurrences plus the number of failures:

p(a) = n(a) / [n(a) + n(b)]

where n(a) is the number of times the event a occurs and n(b) the number of times it fails to occur.
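This simplest form of probability can be computed directly in Python; the coin-toss numbers follow the examples above:

```python
# Probability as occurrences over total outcomes, as in the formula above:
# n_event counts occurrences of the targeted event, n_fail the failures.

def probability(n_event, n_fail):
    return n_event / (n_event + n_fail)

# Fair coin: one way to get heads, one way to fail (tails).
print(probability(1, 1))   # 0.5
# Certain event ("heads or tails"): no failing outcomes.
print(probability(2, 0))   # 1.0
# Impossible event (coin lands on its edge, by assumption): no occurrences.
print(probability(0, 2))   # 0.0
```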

Clustering

Clustering is the grouping of a particular set of objects based on their characteristics, aggregating them according to their similarities. In data mining, this methodology partitions the data by applying a specific join algorithm best suited to the desired information analysis.

Clustering analysis allows an object either to not be part of a cluster or to belong to it strictly; this type of grouping is called hard partitioning. Soft partitioning, on the other hand, states that every object belongs to each cluster to a determined degree. More specific divisions can be created, such as allowing objects to belong to multiple clusters, forcing an object to participate in only one cluster, or constructing hierarchical trees of group relationships.
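A minimal sketch of soft partitioning, assuming one simple illustrative scheme in which an object's membership degree in each cluster is inversely proportional to its distance from the cluster centre (real algorithms such as fuzzy c-means refine this idea):

```python
# Soft-partitioning sketch: each object gets a membership degree in every
# cluster, inversely proportional to its distance from the cluster centre
# and normalised so the degrees sum to 1. The centres are made up.

def memberships(x, centres):
    inv = [1.0 / (abs(x - c) + 1e-9) for c in centres]  # avoid divide-by-zero
    total = sum(inv)
    return [w / total for w in inv]

degrees = memberships(2.0, centres=[0.0, 3.0])
print(degrees)        # higher degree for the nearer centre at 3.0
print(sum(degrees))   # the degrees sum to 1
```

Under hard partitioning, by contrast, the object would simply be assigned outright to the single nearest centre.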

There are several different ways to implement this partitioning, based on distinct models. Distinct algorithms are applied to each model, differentiating its properties and results. These models are distinguished by their organization and the type of relationship between their members. The most important ones are:

- Centralized -- each cluster is represented by a single mean vector, and an object's values are compared to these mean values
- Distributed -- the cluster is built using statistical distributions
- Connectivity -- the connectivity in these models is based on a distance function between elements
- Group -- algorithms have only group information
- Graph -- the cluster organization and the relationships between members are defined by a graph-linked structure
- Density -- members of the cluster are grouped by regions where observations are dense and similar

Introduction to K-means Clustering

K-means clustering is a type of unsupervised learning,
which is used when you have unlabeled data (i.e., data
without defined categories or groups). The goal of this
algorithm is to find groups in the data, with the number of
groups represented by the variable K. The algorithm works
iteratively to assign each data point to one of K groups
based on the features that are provided. Data points are
clustered based on feature similarity. The results of
the K-means clustering algorithm are:
1. The centroids of the K clusters, which can be used to
label new data

2. Labels for the training data (each data point is assigned to a single cluster)
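The two outputs above can be illustrated with a minimal K-means implementation in plain Python; the toy 2-D data points and K = 2 are invented for illustration:

```python
# Minimal K-means sketch (toy 2-D data, K = 2). Returns exactly the two
# outputs listed above: the K centroids and a cluster label per point.
import random

def kmeans(points, k, iters=20, seed=0):
    random.seed(seed)
    centroids = random.sample(points, k)   # initialise from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = []
        for p in points:
            d = [sum((pi - ci) ** 2 for pi, ci in zip(p, c)) for c in centroids]
            labels.append(d.index(min(d)))
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

data = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (8.5, 9), (9, 8)]
centroids, labels = kmeans(data, k=2)
print(centroids)  # one centroid near each of the two obvious groups
print(labels)     # the first three points share a label; so do the last three
```

The returned centroids can then label new data by the same nearest-centroid rule used in the assignment step.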

Rather than defining groups before looking at the data,
clustering allows you to find and analyze the groups that
have formed organically. The "Choosing K" section below
describes how the number of groups can be determined.

Each centroid of a cluster is a collection of feature values which define the resulting group. Examining the centroid feature values offers a qualitative interpretation of what kind of group each cluster represents.

This introduction to the K-means clustering algorithm covers:

 Common business cases where K-means is used

 The steps involved in running the algorithm

 A Python example using delivery fleet data

Naive Bayes

You are working on a classification problem and you have generated your
set of hypotheses, created features and discussed the importance of
variables. Within an hour, stakeholders want to see the first cut of the
model.
What will you do? You have hundreds of thousands of data points and quite
a few variables in your training data set. In such a situation, if I were in your
place, I would use Naive Bayes, which can be extremely fast relative
to other classification algorithms. It works on the Bayes theorem of probability
to predict the class of an unknown data set.

In this article, I'll explain the basics of this algorithm, so that the next time
you come across a large data set, you can bring this algorithm into
action. In addition, if you are a newbie in Python or R, you should not be
overwhelmed by the code presented in this article.

By the Bayes theorem, the posterior probability is:

P(c|x) = [P(x|c) * P(c)] / P(x)

Above,

 P(c|x) is the posterior probability of class (c, target) given predictor (x, attributes).
 P(c) is the prior probability of class.
 P(x|c) is the likelihood, which is the probability of predictor given class.
 P(x) is the prior probability of predictor.
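With made-up numbers for a hypothetical spam filter, these four quantities plug into the Bayes theorem directly:

```python
# Toy illustration of the Bayes-theorem terms listed above:
# P(c|x) = P(x|c) * P(c) / P(x). All numbers are invented for illustration.

p_c = 0.4            # P(c): prior probability of the class "spam"
p_x_given_c = 0.5    # P(x|c): likelihood of seeing a word given "spam"
p_x = 0.25           # P(x): prior probability of seeing the word at all

p_c_given_x = p_x_given_c * p_c / p_x   # posterior P(c|x)
print(p_c_given_x)   # 0.8
```

So although only 40% of messages are spam a priori, observing this particular word raises the posterior probability of spam to 0.8.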
