
1 Population and sample

Data
sample data

The Central Limit Theorem


- the distribution of the sample means will be approximately Normal when the sample size is large enough

- the higher the sample size, the lower the standard error
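A small Python sketch of the second point, assuming the usual standard error formula sd / sqrt(n) for the sample mean. The function name `standard_error`, the value sd = 12, and the list of sample sizes are illustrative assumptions, not part of the notes.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


# Illustrative values only: a fixed sd with growing sample sizes.
errors = {n: standard_error(12, n) for n in (10, 50, 100, 1000)}

for n, se in errors.items():
    print(f"n={n:5d}  standard error={se:.4f}")
```

The printed standard error gets smaller on every row, which is the "higher sample size, lower error" rule in numbers.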

2 Hypothesis

We get a result from sample data; is it also true for the population data? To find this out, we use a hypothesis test.

- Null (H0): the result we don't want. The null always takes the "equal to" form.
- Alternate (H1): the result we want. In a 1-tail test the alternate takes the ">" or "<" form.

Normal distribution will always be a continuous distribution

The probability of a single point is always 0 in a continuous distribution
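To see why a single point has probability 0, here is a rough Python sketch that shrinks an interval around one value. The parameters (mean 40, SD 12, point 35) and the helper name `interval_probability` are assumptions chosen for illustration.

```python
from statistics import NormalDist

# Illustrative normal distribution; the parameters are assumptions.
dist = NormalDist(mu=40, sigma=12)


def interval_probability(lo: float, hi: float) -> float:
    """P(lo <= X <= hi), computed as a difference of cumulative probabilities."""
    return dist.cdf(hi) - dist.cdf(lo)


# Shrink an interval centered on 35: the probability shrinks with it.
widths = [10, 1, 0.1, 0.001]
probs = [interval_probability(35 - w / 2, 35 + w / 2) for w in widths]

# An interval of zero width is a single point.
point = interval_probability(35, 35)

for w, p in zip(widths, probs):
    print(f"width={w:<6} probability={p:.6f}")
print("single point:", point)
```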

Confidence range
Confidence Level %

NORM.INV: you have the probability and calculate the variable (x)

NORM.DIST: you have the variable (x) and calculate the probability
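The two directions can be sketched in Python, assuming `statistics.NormalDist` as a stand-in for the Excel functions: `cdf` plays the role of NORM.DIST (cumulative) and `inv_cdf` the role of NORM.INV. The mean 40 and SD 12 are illustrative values.

```python
from statistics import NormalDist

dist = NormalDist(mu=40, sigma=12)  # illustrative parameters

# Variable -> probability, like =NORM.DIST(35, 40, 12, TRUE)
p = dist.cdf(35)

# Probability -> variable, like =NORM.INV(p, 40, 12)
x = dist.inv_cdf(p)

print(f"P(X < 35) = {p:.4f}")
print(f"value with that cumulative probability = {x:.4f}")
```

Feeding the probability back into `inv_cdf` returns the original value, which shows that the two functions are inverses of each other.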

A continuous distribution always gives probabilities over a range of values

A discrete distribution always gives a probability for a specific number
In the case of a 2-tail test, the alternate always takes the "not equal to" form.
x       35
mean    40
SD      12
n       50

N(x<35)       0.16%      =NORMDIST(B1,B2,B3/SQRT(B4),TRUE)
              99.84%     =1-E2
z             -2.94628   =(B1-B2)/(B3/SQRT(B4))
N(z<-2.95)    0.16%

99%           43.94794
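The spreadsheet calculation above can be reproduced in Python as a rough check, assuming `statistics.NormalDist` stands in for NORMDIST. The variable names are illustrative, and reading the 99% row as the 99th percentile of the sample mean is an assumption.

```python
from math import sqrt
from statistics import NormalDist

x, mean, sd, n = 35, 40, 12, 50

# Standard error of the sample mean, the B3/SQRT(B4) term in the sheet.
se = sd / sqrt(n)
sampling_dist = NormalDist(mu=mean, sigma=se)

p_below = sampling_dist.cdf(x)   # N(x<35), about 0.16%
p_above = 1 - p_below            # about 99.84%

# The same probability through the z value.
z = (x - mean) / se
p_from_z = NormalDist().cdf(z)

# Assumed meaning of the 99% row: value below which 99% of sample means fall.
x_99 = sampling_dist.inv_cdf(0.99)

print(f"z = {z:.5f}, N(x<35) = {p_below:.2%}, 99% value = {x_99:.5f}")
```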

standard

Mean of the sample            20000
Mean of the population        19000
SD                            4000
n                             100
Confidence level (C.L.)       95%
z                             1.644854    =NORM.S.INV(D21)
Critical value (x critical)   19657.94    =D18+D19/SQRT(D20)*D22

- The more confident you want to be, the further you will be from the center.
- The confidence level is always an assumption.
- We use NORM.S.INV to calculate the critical z from the confidence level.
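A minimal Python sketch of the critical value calculation, assuming `NormalDist().inv_cdf` as the equivalent of NORM.S.INV. The variable names are illustrative; the numbers are the ones from the sheet.

```python
from math import sqrt
from statistics import NormalDist

pop_mean, sd, n = 19000, 4000, 100
confidence = 0.95

# Critical z for a one-tail test at the chosen confidence level.
z_critical = NormalDist().inv_cdf(confidence)

# Convert the critical z back to the scale of the sample mean.
x_critical = pop_mean + sd / sqrt(n) * z_critical

print(f"z critical = {z_critical:.6f}")
print(f"x critical = {x_critical:.2f}")
```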

If you want to launch this course, then you have to be 95% confident.
The decision changes as your confidence level changes.

z (sample)   2.5   =(D17-D18)/(D19/SQRT(D20))
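How the decision moves with the confidence level can be sketched as follows, assuming a one-tail (right-tail) test in which the null is rejected when the sample z exceeds the critical z. The helper `reject_null` and the list of confidence levels are hypothetical, added for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def reject_null(z_sample: float, confidence: float) -> bool:
    """Right-tail test: reject the null when z_sample is above the critical z."""
    z_critical = NormalDist().inv_cdf(confidence)
    return z_sample > z_critical


# z of the sample, same numbers as the sheet.
z_sample = (20000 - 19000) / (4000 / sqrt(100))

# The same sample gives different decisions at different confidence levels.
decisions = {cl: reject_null(z_sample, cl) for cl in (0.90, 0.95, 0.99, 0.995)}

for cl, rejected in decisions.items():
    print(f"confidence {cl:.1%}: reject null = {rejected}")
```

With these numbers the null is rejected at 95% but no longer at 99.5%, because the critical z moves past the sample z of 2.5.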

Q. What is the probability that your average will be less than 20,000 (< 20000)?

1  confidence level
2  significance level = 1 - confidence level = 5% (the p-value is compared against the significance level)

Mean of the sample       20000
Mean of the population   19000
SD                       4000
Sample size (n)          100
C.L.                     95%
P1                       2.5%       =(1-D44)/2
P2                       97.50%     =D44+D46
z critical 1 (ZC1)       -1.95996   =NORM.S.INV(D46)
z critical 2 (ZC2)       1.959964   =NORM.S.INV(D47)
z (sample)               2.5        =(D40-D41)/(D42/SQRT(D43))

Probability              99.38%     =NORM.DIST(D17,D18,D19/SQRT(D20),TRUE)
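The two-tail critical values can be checked with a short Python sketch, again assuming `NormalDist().inv_cdf` in place of NORM.S.INV. The names `p1`, `p2`, `zc1`, and `zc2` mirror the sheet labels; the final comparison is an illustrative addition.

```python
from math import sqrt
from statistics import NormalDist

confidence = 0.95

# Split the leftover 5% equally between the two tails.
p1 = (1 - confidence) / 2   # lower tail, 2.5%
p2 = confidence + p1        # cumulative probability at the upper cut, 97.5%

standard_normal = NormalDist()
zc1 = standard_normal.inv_cdf(p1)
zc2 = standard_normal.inv_cdf(p2)

# z of the sample, same numbers as the sheet.
z_sample = (20000 - 19000) / (4000 / sqrt(100))
outside_interval = z_sample < zc1 or z_sample > zc2

print(f"ZC1 = {zc1:.6f}, ZC2 = {zc2:.6f}, sample z outside: {outside_interval}")
```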

We can calculate the probability from z also

Probability 99.38%
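Both routes to the 99.38% figure can be compared in Python, assuming `statistics.NormalDist` in place of the Excel functions. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, pop_mean, sd, n = 20000, 19000, 4000, 100
se = sd / sqrt(n)

# Route 1: work directly with x on the sampling distribution of the mean.
p_from_x = NormalDist(mu=pop_mean, sigma=se).cdf(sample_mean)

# Route 2: convert to z first, then use the standard normal distribution.
z = (sample_mean - pop_mean) / se
p_from_z = NormalDist().cdf(z)

print(f"from x: {p_from_x:.2%}, from z: {p_from_z:.2%}")
```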


If the p-value is greater than the significance level, then you should accept the null hypothesis (strictly, fail to reject it).
If the p-value is less than the significance level, then you should not accept the null hypothesis (reject it).
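The rule can be written as a small Python sketch. The function `decide` is hypothetical, and using the one-tail (upper tail) p-value of the sample z = 2.5 is an assumption made for the example.

```python
from statistics import NormalDist


def decide(p_value: float, significance: float) -> str:
    """Compare the p-value with the significance level."""
    if p_value < significance:
        return "reject H0"
    return "do not reject H0"


significance = 1 - 0.95  # significance level = 1 - confidence level

# Assumed one-tail p-value: probability of a z at least as large as 2.5.
p_value = 1 - NormalDist().cdf(2.5)

print(f"p-value = {p_value:.5f} -> {decide(p_value, significance)}")
```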

they will buy only if it is equal to 19000
WHAT's the use?

we want to predict something

we need some data to predict


In the 1st lecture we studied how to import data

and how to find out whether it follows a normal distribution or another one

by looking at the data

Yes/No: it's a discrete variable

Discrete: we can count the values

Continuous: any value; we cannot count them (infinite possibilities)

Discrete
uniform
binomial: yes/no type answers (logistic regression)
Bernoulli
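For contrast with the continuous case, a discrete distribution gives a non-zero probability to an exact count. The sketch below uses the standard binomial formula; the function name `binomial_pmf` and the example numbers (10 yes/no trials, p = 0.5) are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k 'yes' answers in n independent yes/no trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Exactly 5 'yes' answers out of 10 when each has probability 0.5.
p_five = binomial_pmf(5, 10, 0.5)

# A Bernoulli trial is the special case n = 1.
p_single_yes = binomial_pmf(1, 1, 0.3)

# The probabilities of all possible counts add up to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

print(p_five, p_single_yes, total)
```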

You should know what kind of distribution the data follows

Age: it's a continuous variable,

but we treated it as a discrete variable

In continuous
normal distribution
- bell curve
- it has the mean in the center
- you know everything about this distribution once you know its mean and standard deviation

standard normal distribution: to make it simpler, they've created this

- where the mean is zero and the standard deviation is one

x: normal distribution; you need the mean and the standard deviation

z: standard normal distribution

if z is 2 then you are 2 SD away from the center
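The conversion from x to z can be sketched in Python. The helper `z_score` and the example values (mean 40, SD 12, x = 64) are assumptions chosen so that the result is exactly 2.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# 64 is 24 above the mean of 40, and 24 is two standard deviations of 12.
z = z_score(64, 40, 12)
print(z)
```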

The Central Limit Theorem


- the means of all possible samples will follow a normal distribution
- the mean of all possible sample means is equal to the population mean
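A rough simulation of both points in Python. It draws many samples rather than all possible samples, and the uniform population on 0 to 100, the sample size of 30, the number of samples, and the seed are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import fmean, stdev

random.seed(0)  # fixed seed so that the run is repeatable

SAMPLE_SIZE = 30
NUM_SAMPLES = 5000

# Population: uniform on [0, 100], which is clearly not normal.
population_mean = 50
population_sd = 100 / sqrt(12)

# Draw many samples and keep only the mean of each one.
sample_means = [
    fmean(random.uniform(0, 100) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = fmean(sample_means)
sd_of_means = stdev(sample_means)
expected_se = population_sd / sqrt(SAMPLE_SIZE)

print(f"mean of sample means = {mean_of_means:.2f} (population mean {population_mean})")
print(f"sd of sample means   = {sd_of_means:.2f} (sd/sqrt(n) = {expected_se:.2f})")
```

A histogram of `sample_means` would look bell shaped even though the population itself is flat.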
Hypothesis
- we get some results on the basis of sample data,
so we use a hypothesis test to find out whether this result is true for the population or not

Whenever there is a > or < sign, it is a one-tail test

In a right-tail test, if the sample value is greater than the critical value, the alternate is accepted; otherwise the null is kept

If the p-value is less than the significance level, then H0 is rejected,

and if the p-value is greater than the significance level, then H0 is not rejected

Whenever you have fewer than 30 data points, the sample follows the t distribution
its center is always zero

it has fatter tails

it is defined by only one parameter, the degrees of freedom, which is n-1
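For reference, the test statistic usually paired with the t distribution can be written as follows. This assumes the standard one-sample setting; the symbols (sample mean, hypothesized population mean, sample standard deviation s, sample size n) are the conventional ones and are not defined in these notes.

```latex
t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad \text{degrees of freedom} = n - 1
```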
sigma
