
Defining the problem in quantitative terms, with quantifiable components:

- customer analysis
- revenue analysis
- spends analysis

Techniques to achieve above:


- Data mining
- Lapsers (consulting component)

- Hypothesis testing for segmentation analysis between female and male
audiences (statistical approach)

- Data descriptives

- Fisher testing

-structure data and


Data warehouse:
rows: observations/records
columns: fields/variables/labels

- Categorical values
- ANOVA
- MANOVA
- Supervised learning (regression is an example)
- Unsupervised learning (clustering is an example)

- Conjoint analysis
- Machine learning techniques: divide the data into training and testing sets.
x - features.
y - label.
Observations are cases.
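
A minimal sketch of the training/testing split mentioned above. The built-in iris data, the 70/30 ratio and the seed are assumptions made for illustration only; they are not from the notes.

```r
# Sketch: 70/30 train/test split on the built-in iris data (assumed stand-in dataset)
set.seed(42)                                    # assumed seed, makes the split reproducible
n <- nrow(iris)                                 # each row is one observation (case)
train_idx <- sample(n, size = round(0.7 * n))   # row numbers used for training
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]                     # remaining rows are held out for testing

features <- setdiff(names(iris), "Species")     # x: the feature columns
label    <- "Species"                           # y: the label column

nrow(train)
nrow(test)
```

The model is fitted on train only, and test is used afterwards to check how well it generalises.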

Time series modelling

Recommendations and collaborative filtering


--------------------------------------
28th April:

Probability:
1. Marginal
2. Joint
3. Conditional

         | CD | NO-CD | Total
AC       | 20 |  50   |  70
NON-AC   | 20 |  10   |  30
Total    | 40 |  60   | 100
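
As a rough illustration, the three kinds of probability can be read off the AC/CD table in R. The variable names are made up for this sketch; the counts are the ones in the table.

```r
# Counts from the AC / CD table in the notes
counts <- matrix(c(20, 50,
                   20, 10),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("AC", "NON-AC"), c("CD", "NO-CD")))
total <- sum(counts)                                       # 100 observations

p_ac          <- sum(counts["AC", ]) / total               # marginal:    P(AC)        = 70/100
p_ac_and_cd   <- counts["AC", "CD"] / total                # joint:       P(AC and CD) = 20/100
p_ac_given_cd <- counts["AC", "CD"] / sum(counts[, "CD"])  # conditional: P(AC | CD)   = 20/40

addmargins(counts)                                         # table with row and column totals
c(marginal = p_ac, joint = p_ac_and_cd, conditional = p_ac_given_cd)
```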

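Mutually exclusive - the events cannot occur together, so P(A and B) = 0.


Independent - if the proportion is the same with or without the other event, i.e. P(A|B) = P(A), then the events are independent.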
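
One way to check independence on the AC/CD table, sketched under the assumption that comparing P(AC | CD) with P(AC) is the intended "same proportion" test. The expected counts are what the table would look like if the two attributes were independent.

```r
# Counts from the AC / CD table in the notes
counts <- matrix(c(20, 50,
                   20, 10),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("AC", "NON-AC"), c("CD", "NO-CD")))
total <- sum(counts)

p_ac          <- sum(counts["AC", ]) / total               # P(AC)      = 0.7
p_ac_given_cd <- counts["AC", "CD"] / sum(counts[, "CD"])  # P(AC | CD) = 0.5

# Independent only if the two proportions match
independent <- isTRUE(all.equal(p_ac, p_ac_given_cd))
independent

# Counts expected under independence: row total * column total / grand total
expected <- outer(rowSums(counts), colSums(counts)) / total
expected
```

Here the proportions differ (0.7 against 0.5), so AC and CD are not independent in this table.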

Question on families and car buyers:

a) marginal - 80/200
b) joint - 42/200
c) conditional -
Bayes' theorem:

P(B|A) = P(A|B)P(B) / [ P(A|B)P(B) + P(A|B')P(B') ]
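
A small worked check of Bayes' theorem using the AC/CD table from these notes. Taking B = AC and A = CD is a choice made for this sketch, not something the notes specify.

```r
# B = AC, B' = NON-AC, A = CD (assignment assumed for illustration)
p_B            <- 70 / 100   # P(AC)
p_notB         <- 30 / 100   # P(NON-AC)
p_A_given_B    <- 20 / 70    # P(CD | AC)
p_A_given_notB <- 20 / 30    # P(CD | NON-AC)

# Bayes' theorem: P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B')P(B')]
posterior <- (p_A_given_B * p_B) /
  (p_A_given_B * p_B + p_A_given_notB * p_notB)
posterior                    # P(AC | CD)
```

The result agrees with reading the conditional probability straight from the table, 20/40.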

Probability distribution:
# Simulate the total of two fair six-sided dice
roll = function(){
  die = 1:6
  dicetotal = sample(die, size=1) + sample(die, size=1)
  dicetotal
}

roll()

results = replicate(1000,roll())
t1=table(results)
t1
barplot(t1)
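
For comparison, the exact distribution of the two-dice total can be listed alongside a simulation. This is only a sketch: the seed and the 10000 repetitions are assumptions, and sampling two dice at once with replace = TRUE stands in for the two separate sample() calls in roll().

```r
die <- 1:6
totals <- as.vector(outer(die, die, "+"))   # all 36 equally likely (die1, die2) totals
exact <- table(totals) / 36                 # exact P(total) for totals 2..12

set.seed(1)                                 # assumed seed, for reproducibility
sim <- replicate(10000, sum(sample(die, size = 2, replace = TRUE)))
sim_prop <- table(factor(sim, levels = 2:12)) / length(sim)

# Exact and simulated proportions side by side
comparison <- rbind(exact = as.numeric(exact), simulated = as.numeric(sim_prop))
colnames(comparison) <- 2:12
round(comparison, 3)
```

With more repetitions the simulated row should move closer to the exact row.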

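Binomial distribution (discrete distribution):


P = probability of a good pen
1-P = probability of a bad pen

One particular sequence of 3 good and 2 bad pens: P P P (1-P)(1-P) = P^3 (1-P)^2 = P^3 (1-P)^(5-3)
Number of such sequences: 5C3
So P(3 good out of 5) = 5C3 * P^3 * (1-P)^(5-3)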
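
The counting argument for 3 good pens out of 5 can be checked against dbinom. The value p = 0.6 is an assumption borrowed from the later example in these notes.

```r
p <- 0.6                                 # assumed probability of a good pen
n <- 5                                   # pens drawn
k <- 3                                   # good pens wanted

one_sequence <- p^k * (1 - p)^(n - k)    # probability of one specific order, e.g. G G G B B
n_sequences  <- choose(n, k)             # 5C3 = number of possible orders
by_hand      <- n_sequences * one_sequence

built_in <- dbinom(k, size = n, prob = p)
c(by_hand = by_hand, built_in = built_in)
```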

Through Excel (n = 7, p = 0.6, cell A4 holds x):
=BINOM.DIST(A4,7,0.6,0)

x    P(x)
0    0.0016384
1    0.0172032
2    0.0774144
3    0.193536
4    0.290304
5    0.2612736
6    0.1306368
7    0.0279936

Through R:
dbinom(1,7,0.6)

x=0:7
prob<-dbinom(x,7,0.6)
data.frame(x,prob)
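
A few checks that can be run on the same binomial example (n = 7, p = 0.6). This is a sketch using pbinom for cumulative probabilities; the cut-off of 3 is chosen only for illustration.

```r
x <- 0:7
prob <- dbinom(x, size = 7, prob = 0.6)

total_prob <- sum(prob)                       # all probabilities together
cum_prob   <- cumsum(prob)                    # P(X <= x) for x = 0..7
p_at_most_3  <- pbinom(3, size = 7, prob = 0.6)       # P(X <= 3)
p_at_least_4 <- 1 - p_at_most_3                       # P(X >= 4)
mean_x <- sum(x * prob)                       # expected value, equals n * p

data.frame(x, prob, cum_prob)
c(total = total_prob, at_most_3 = p_at_most_3, at_least_4 = p_at_least_4, mean = mean_x)
```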

Poisson distribution (discrete distribution):

Formula:
P(x) = e^(-lambda) * lambda^x / x!
lambda = average number of events
e = 2.718 (approx.)
In R:
#Poisson
dpois(4,3)
1 - ppois(3,3)
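
The two Poisson commands above can be reproduced from the formula directly. A minimal sketch with lambda = 3 as in the commands; the variable names are made up here.

```r
lambda <- 3                                              # average number of events
x <- 4

by_hand  <- exp(-lambda) * lambda^x / factorial(x)       # P(X = 4) from the formula
built_in <- dpois(x, lambda)                             # same value from R

# P(X > 3) = 1 - [P(0) + P(1) + P(2) + P(3)]
more_than_3_by_hand <- 1 - sum(dpois(0:3, lambda))
more_than_3         <- 1 - ppois(3, lambda)

c(by_hand = by_hand, built_in = built_in)
c(by_hand = more_than_3_by_hand, built_in = more_than_3)
```
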

Normal distribution (continuous distribution):

Probability density function (PDF) of the normal distribution.

EXAMPLE PROBLEM :

a) Excel: =NORM.DIST(0.28,0.295,0.025,1) = 0.2742531
R:
mean=0.295
sd=0.025
pnorm(0.28,mean,sd)

b) Excel: =1-NORM.DIST(0.35,0.295,0.025,1) = 0.013903448
R:
1-pnorm(0.35,mean,sd)
pnorm(0.35,mean,sd,lower.tail=FALSE)

c) R:
pnorm(0.34,mean,sd) - pnorm(0.26,mean,sd)
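
The same answers can be reached by converting to z-scores first, which is how the problem would be solved with a standard normal table. A sketch using the mean and sd of the example; mu and sigma are names chosen here to avoid reusing mean and sd.

```r
mu    <- 0.295
sigma <- 0.025

z_a <- (0.28 - mu) / sigma             # standardise: z = (x - mean) / sd
p_a <- pnorm(z_a)                      # part a): P(X < 0.28) via the standard normal

z_b <- (0.35 - mu) / sigma
p_b <- pnorm(z_b, lower.tail = FALSE)  # part b): P(X > 0.35)

c(z_a = z_a, p_a = p_a, z_b = z_b, p_b = p_b)
pnorm(0.28, mu, sigma)                 # matches p_a without standardising by hand
```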

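# Two-sample t-test (equal variances assumed) comparing WingA and WingB
myluggage <- read.csv("Luggage.csv", header = TRUE)
View(myluggage)

head(myluggage)
attach(myluggage)
?t.test
?attach
t.test(WingA, WingB, var.equal = TRUE, alternative = "two.sided", conf.level = 0.95)

# Paired t-test: alternative is that the mean of (TwoDays - SevenDays) is less than 0, at 99% confidence
concreteData <- read.csv("Concrete1.csv")
attach(concreteData)
t.test(TwoDays, SevenDays, paired = TRUE, alternative = "less", conf.level = 0.99)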

t.test: parameters, confidence level, null hypothesis
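
A sketch of how the pieces of a t.test result can be read. The data here are simulated (hypothetical values, not the Luggage.csv data), and the 0.05 significance level is an assumption matching conf.level = 0.95.

```r
set.seed(123)                              # assumed seed, for reproducibility
wing_a <- rnorm(30, mean = 10, sd = 2)     # hypothetical sample, stands in for WingA
wing_b <- rnorm(30, mean = 11, sd = 2)     # hypothetical sample, stands in for WingB

# Null hypothesis: the two population means are equal
res <- t.test(wing_a, wing_b, var.equal = TRUE,
              alternative = "two.sided", conf.level = 0.95)

res$statistic    # t value
res$parameter    # degrees of freedom (n1 + n2 - 2 with var.equal = TRUE)
res$p.value      # p-value
res$conf.int     # confidence interval for the difference in means

alpha <- 0.05    # assumed significance level (1 - conf.level)
decision <- if (res$p.value < alpha) "reject the null" else "fail to reject the null"
decision
```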

Chi-square analysis basics

Chi-square test: goodness of fit.

The rejection region is always in the right tail.

Command in R: pchisq
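
A minimal goodness-of-fit sketch using pchisq. The observed counts are hypothetical (60 rolls of a die, testing whether it is fair) and the 5% level is an assumption.

```r
observed   <- c(8, 12, 9, 11, 6, 14)       # hypothetical counts for faces 1..6
expected_p <- rep(1 / 6, 6)                # fair die under the null hypothesis
expected   <- sum(observed) * expected_p   # expected count per face

stat <- sum((observed - expected)^2 / expected)   # chi-square statistic
df   <- length(observed) - 1                      # degrees of freedom

p_value  <- pchisq(stat, df, lower.tail = FALSE)  # right-tail area beyond the statistic
critical <- qchisq(0.95, df)                      # right-tail cut-off at the assumed 5% level

c(statistic = stat, df = df, p_value = p_value, critical = critical)
chisq.test(observed, p = expected_p)              # built-in test, for comparison
```

The null is rejected only when the statistic falls to the right of the critical value, which is the same as the p-value falling below 0.05.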
