Flowchart:
> 50 samples? No: get more data. Yes: predicting a category?
Predicting a category? Yes: labeled data? No: predicting a quantity?
Labeled data? Yes: classification problem. No: clustering problem.
Predicting a quantity? Yes: regression problem. No: just looking?
Just looking? Yes: dimensionality reduction problem. No: producing structure (tough luck).
Predictive modeling
Predictive modeling is a commonly used term for all
statistical techniques that predict future behavior.
Predictive modeling solutions are a form of data-mining
technology that works by analyzing historical and current data
and generating a model to help predict future outcomes from
the same or new data.
Simply put, predictive analytics takes past trends and applies
them to the future.
For example, if a customer purchases a smartphone from an
e-commerce website:
He might be interested in its accessories immediately.
He might be a potential customer for a phone battery a few years down the line.
Currently, the chances of him buying an accessory for a competitor's smartphone are
relatively slim.
[Figure: Predictive modeling = Data extraction & transformation + Business understanding]
What are the moves to travel from [start state] to [goal state], given that you can swap
an adjacent tile with the blank space in one move?
Regression Analysis
Regression analysis helps one understand how the typical value of the dependent
variable (or 'criterion variable') changes when any one of the independent variables
is varied, while the other independent variables are held fixed.
Regression analysis estimates the conditional expectation of the dependent variable
given the independent variables – that is, the average value of the dependent
variable when the independent variables are fixed.
In all regression analysis problems, a function of the independent variables called
the regression function is to be estimated.
In simple linear regression we predict y from the input x with the line
y = B0 + B1*x. The goal of the linear regression learning algorithm is to
find the values of the coefficients B0 and B1.
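The slides do not say how B0 and B1 are estimated. Assuming ordinary least squares (the usual choice, but an assumption here), the coefficients for n observations (x_i, y_i) can be written as:

```latex
\hat{y} = B_0 + B_1 x, \qquad
B_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
B_0 = \bar{y} - B_1 \bar{x}
```

Here \bar{x} and \bar{y} are the sample means of the inputs and outputs.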
The marketing manager of Habib Bank Limited has a large marketing force at his
office and wants to determine whether there is a relationship between the number
of calls made to potential customers in a month and the number of new accounts
opened during the month.

Marketing representative   Sales calls   Accounts opened
MUJAHID HUSSAIN            96            41
TALHA MUHAMMAD             40            41
ABDUL MALIK                104           51
AHMAD AYUB                 128           60
AHMED NAWAZ KHAN           164           61
HAMZA AYOUB                76            29
HASSAN IFTIKHAR            72            39
IBRAHIM AHMAD              80            50
INSHA WAMIQ                36            28
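As a minimal sketch of how this question could be examined, the code below fits a straight line to the nine (sales calls, accounts opened) pairs from the table. It assumes ordinary least squares as the fitting method; the function name and the 100-call query are hypothetical and not part of the lecture.

```python
# Sales calls (x) and accounts opened (y) for the nine representatives in the table.
calls = [96, 40, 104, 128, 164, 76, 72, 80, 36]
accounts = [41, 41, 51, 60, 61, 29, 39, 50, 28]


def fit_simple_linear_regression(x, y):
    """Return (b0, b1) for y ~ b0 + b1 * x using ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_simple_linear_regression(calls, accounts)
print(f"B0 = {b0:.3f}, B1 = {b1:.3f}")

# Hypothetical query: expected accounts for a representative making 100 calls.
predicted = b0 + b1 * 100
print(f"Predicted accounts for 100 calls: {predicted:.1f}")
```

A positive B1 would suggest that more calls go together with more accounts opened, although nine data points are too few to draw a firm conclusion.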
An algorithm belonging to the family of linear algorithms.
Another well-known linear algorithm is the Perceptron.
The LDA (Linear Discriminant Analysis) model consists of statistical
properties of your data, calculated for each class.
For a single input variable this includes:
The mean value for each class.
The variance calculated across all classes.
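The two statistics above can be sketched in a few lines. This is an illustration under assumptions, not the lecture's implementation: the toy data and function names are hypothetical, the shared variance is pooled over the classes, and prediction uses the standard one-dimensional linear discriminant score with class priors taken from class frequencies.

```python
import math


def fit_lda_1d(xs, labels):
    """Estimate per-class means, one pooled variance, and class priors."""
    classes = sorted(set(labels))
    n = len(xs)
    means, priors = {}, {}
    for c in classes:
        values = [x for x, y in zip(xs, labels) if y == c]
        means[c] = sum(values) / len(values)
        priors[c] = len(values) / n
    # One variance shared by all classes: squared deviations from each
    # point's own class mean, divided by (n - number of classes).
    sq_dev = sum((x - means[y]) ** 2 for x, y in zip(xs, labels))
    variance = sq_dev / (n - len(classes))
    return means, variance, priors


def predict_lda_1d(x, means, variance, priors):
    """Return the class with the largest linear discriminant score."""
    def score(c):
        return (x * means[c] / variance
                - means[c] ** 2 / (2 * variance)
                + math.log(priors[c]))
    return max(means, key=score)


# Hypothetical toy data: one input variable, two classes.
xs = [1.0, 1.5, 2.0, 6.0, 6.5, 7.0]
labels = ["low", "low", "low", "high", "high", "high"]

means, variance, priors = fit_lda_1d(xs, labels)
print(means, variance)
print(predict_lda_1d(2.5, means, variance, priors))
print(predict_lda_1d(5.0, means, variance, priors))
```

With equal priors and a shared variance, the decision boundary sits midway between the two class means.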
Lecture 18 – Predictive modelling
CE451 Applied Artificial Intelligence
If KNN gives good results on your dataset, try using LVQ (Learning Vector
Quantization) to reduce the memory requirements of storing the entire
training dataset.
Best results are achieved if you rescale your data to have the same
range, such as between 0 and 1.
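A rough sketch of both points: rescaling every column to the range 0 to 1, then training a small LVQ codebook that stands in for the full training set at prediction time. The data, the learning-rate schedule, the choice of one codebook vector per class, and all function names are assumptions made for illustration; this is the basic LVQ1 update rule, not a tuned implementation.

```python
def min_max_scale(rows):
    """Rescale every column of `rows` to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def nearest(codebook, row):
    """Index of the codebook vector closest to `row` (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i][0], row)),
    )


def train_lvq(rows, labels, codebook, lr=0.3, epochs=20):
    """LVQ1: move the best-matching vector toward same-class samples, away otherwise."""
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        for row, label in zip(rows, labels):
            vec, vec_label = codebook[nearest(codebook, row)]
            sign = 1 if vec_label == label else -1
            for j in range(len(vec)):
                vec[j] += sign * rate * (row[j] - vec[j])
    return codebook


def predict_lvq(codebook, row):
    """Predict the label of the nearest codebook vector."""
    return codebook[nearest(codebook, row)][1]


# Hypothetical data: two features on very different scales, two classes.
raw = [[1, 200], [2, 220], [3, 210], [8, 900], [9, 950], [10, 920]]
labels = ["a", "a", "a", "b", "b", "b"]

scaled = min_max_scale(raw)
# One codebook vector per class, initialised from the first sample of each class.
codebook = [(list(scaled[0]), "a"), (list(scaled[3]), "b")]
train_lvq(scaled, labels, codebook)

predictions = [predict_lvq(codebook, row) for row in scaled]
print(predictions)
```

Only the two codebook vectors need to be kept for prediction, whereas KNN would keep all six training rows.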
The distance between the hyperplane and the closest data points is
referred to as the margin.
The best or optimal hyperplane that can separate the two classes is
the line that has the largest margin.
Only these points are relevant in defining the hyperplane and in the
construction of the classifier.
These points are called the support vectors.
They support or define the hyperplane.
In practice, an optimization algorithm is used to find the values for the
coefficients that maximize the margin.
SVM might be one of the most powerful out-of-the-box classifiers and
worth trying on your dataset.
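To make the margin and support-vector definitions concrete, the sketch below measures the margin of two hand-picked candidate hyperplanes and keeps the one with the larger margin. It does not run the optimization that an SVM performs; the points, the candidate coefficients, and the function names are hypothetical.

```python
import math


def signed_distance(w, b, point):
    """Signed distance from `point` to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, point)) + b) / norm


def margin_and_support_vectors(w, b, points, tol=1e-9):
    """Margin = distance to the closest points; those points are the support vectors."""
    distances = [abs(signed_distance(w, b, p)) for p in points]
    margin = min(distances)
    support = [p for p, d in zip(points, distances) if d - margin <= tol]
    return margin, support


def separates(w, b, points, labels):
    """True if every point lies on the side of the hyperplane matching its label."""
    return all(signed_distance(w, b, p) * y > 0 for p, y in zip(points, labels))


# Hypothetical two-class data with labels +1 and -1.
points = [(2, 2), (3, 3), (0, 0), (-1, 0)]
labels = [1, 1, -1, -1]

# Two hand-picked candidate hyperplanes (w, b); both separate the classes.
candidates = {"A": ((1, 1), -2), "B": ((1, 0), -1)}

results = {}
for name, (w, b) in candidates.items():
    if separates(w, b, points, labels):
        results[name] = margin_and_support_vectors(w, b, points)

best = max(results, key=lambda name: results[name][0])
margin_a, support_a = results["A"]
print(best, results[best])
```

An actual SVM searches over all possible coefficients instead of comparing a fixed list of candidates.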
Clustering - Example
I was teaching CS101 course to Civil ...
One strategy is to sort the students by ...

Student Name              St. Number   Total Marks   Grade
AALIYAN AHMED KHAN        1            82.675        A
...                       5            65.125        A-
...                       ...          63.75         B+
...
MUHAMMAD ABDULLAH         23           51.625        C+
MUHAMMAD AMMAR            24           51.375        C+
MUHAMMAD DAANIAL KHAN     25           51.125        C+
MUHAMMAD IBRAHIM          26           50.85         C+
MUHAMMAD KHIZAR           27           50.375        C
MUHAMMAD NAVEED ZAFAR     28           50.375        C
MUHAMMAD USMAN            29           50.025        C
SARA FATIMA KAZI          30           49.575        C
SHAHZAIB HAIDER           31           49            C-
SYED JAHANZAIB BUKHARI    32           41.175        D+
TAYYABA JAVED             33           39.775        D
VARUN MALANI              34           39.175        D
YUNEEB AHMAD              35           37.3          F
ZAIN ASHFAQ               36           44.575        D+

[Figure: chart plotted against STUDENT ID]
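One way the grading example could be treated as a clustering problem is sketched below: a one-dimensional k-means over the total marks that are legible in the table, reading the decimal commas as decimal points. The choice of k-means, k = 4, and the evenly spaced starting centroids are all assumptions made for illustration; the resulting groups are not meant to reproduce the grades shown.

```python
def kmeans_1d(values, k, max_iter=100):
    """Group numbers into k clusters with Lloyd's algorithm (requires k >= 2)."""
    ordered = sorted(values)
    # Deterministic start: k values spread evenly through the sorted list.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in ordered:
            idx = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return clusters, centroids


# Total marks legible in the table.
marks = [
    82.675, 65.125, 63.75, 51.625, 51.375, 51.125, 50.85, 50.375, 50.375,
    50.025, 49.575, 49, 41.175, 39.775, 39.175, 37.3, 44.575,
]

clusters, centroids = kmeans_1d(marks, k=4)
for centroid, members in zip(centroids, clusters):
    print(f"centroid {centroid:.2f}: {members}")
```

Each cluster collects marks that lie close together, so the cut-offs fall in the gaps between groups of students.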
Dimensionality reduction
Dimensionality reduction can be divided into two categories based on the type of approach used:
feature selection and
feature extraction.
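The difference between the two categories can be sketched as follows. Feature selection keeps a subset of the original columns; feature extraction builds new columns from combinations of them. The variance-based selection rule, the use of the first principal component (found by power iteration) for extraction, the toy data, and all function names are assumptions chosen for illustration.

```python
import math


def column_variances(rows):
    """Population variance of every column."""
    n = len(rows)
    variances = []
    for col in zip(*rows):
        mean = sum(col) / n
        variances.append(sum((v - mean) ** 2 for v in col) / n)
    return variances


def select_features(rows, k):
    """Feature selection: keep the k original columns with the highest variance."""
    variances = column_variances(rows)
    ranked = sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in rows]


def first_principal_component(rows, iterations=100):
    """Direction of greatest variance, found by power iteration on the covariance."""
    n, d = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # fixed start vector; assumed not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return means, v


def extract_feature(rows):
    """Feature extraction: one new column, the projection onto the first component."""
    means, v = first_principal_component(rows)
    return [sum((x - m) * c for x, m, c in zip(row, means, v)) for row in rows]


# Hypothetical data: column 1 is twice column 0, column 2 is constant.
rows = [[1, 2, 5.0], [2, 4, 5.0], [3, 6, 5.0], [4, 8, 5.0]]

kept, selected = select_features(rows, k=2)
_, direction = first_principal_component(rows)
extracted = extract_feature(rows)
print(kept, selected)
print(direction, extracted)
```

Selection here drops the constant column and keeps two original columns, while extraction replaces all three columns with a single new one.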
Dimensionality reduction
Support Vector Machines
Centroid-based clustering
Decision trees
Non-linear regression
Ensemble regressor
Spectral embedding
Density-based clustering
Linear classifiers
Kernel approximation
Random forests
Boosting
Lasso
Self-organizing maps
Non-linear dimensionality reduction
Kernel estimation
Isomap
Elastic net
K-means
Spectral clustering
Logistic regression
Flowchart:
> 50 samples? Yes / No.
Labeled data? Yes: classification problem. No: clustering problem.
Predicting a quantity? Yes: regression problem. No: just looking?
Just looking? Yes: dimensionality reduction problem. No: producing structure (tough luck).