
Statistics

It is a branch of mathematics used to summarize, analyze and interpret a group of numbers or observations.

Types of Statistics
Descriptive Statistics :
It summarizes data to make sense of a list of numeric values.
Inferential Statistics :
It is used to infer or generalize observations made with samples to the larger population from which they were selected. Broadly, it is classified into the theory of estimation and the testing of hypotheses.

Estimation & Testing of Hypothesis
Estimation
The method of estimating the value of a population parameter from the value of the corresponding sample statistic.
Testing of Hypothesis
The method of using sample data to assess a claim or belief about an unknown parameter value.

Types of Estimation
Point estimation
It is the value of the sample statistic that is used to estimate the most likely value of the unknown population parameter.
Interval estimation
It is the range of values that is likely to contain the population parameter value with a specified level of confidence.

Properties of estimation
Consistency: The statistic tends to move closer to the population parameter as the sample size increases.

Unbiasedness: E(Statistic) = Parameter

Efficiency: Refers to the size of the standard error (SE). E.g., for normally distributed data the SE of the sample median is greater than that of the sample mean, so the sample mean is more efficient.

Sufficiency: Refers to how much of the sample information the statistic uses. E.g., the sample mean uses the value of every observation, whereas the sample median depends only on the middle of the ordered sample, so the sample mean makes fuller use of the data.

Drawback of point estimation
No information is available regarding its reliability, i.e., how close it is to the true population parameter.
In fact, the probability that a single sample statistic exactly equals the population parameter is extremely small.

Interval Estimation
Confidence interval = Point estimate ± Margin of error

Margin of error = (critical value of Z or t at the chosen confidence level: 90%, 95% and so on) × (standard error of the particular statistic)

Estimation
Population mean: e.g. average salary
Population proportion: e.g. stock market

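Interval Estimation for population mean (μ)

SAMPLE SIZE: Large sample (n ≥ 30)

FORMULAE
Known SD (σ): $\bar{x} \pm Z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
Unknown SD (σ): $\bar{x} \pm Z_{\alpha/2}\,\dfrac{S}{\sqrt{n}}$
Sample mean square ($S^2$): $S^2 = \dfrac{1}{n-1}\sum (x - \bar{x})^2$

Here $Z_{\alpha/2}$ is the critical value of Z at the chosen confidence level.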
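As a rough illustration of the large-sample interval for a mean, the sketch below computes x̄ ± Z·SD/√n. The function name `mean_confidence_interval` and the input numbers are hypothetical and chosen only for illustration; they do not come from the slides. The critical value is taken from Python's standard-library `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(x_bar, sd, n, confidence=0.95):
    """Z-based interval for a mean: x_bar ± Z * sd / sqrt(n).

    `sd` is the population SD if known, otherwise the sample SD S
    (large-sample case, n >= 30).
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    margin = z * sd / sqrt(n)                # margin of error
    return x_bar - margin, x_bar + margin


# Hypothetical inputs (assumed for illustration only)
low, high = mean_confidence_interval(x_bar=50.0, sd=10.0, n=100, confidence=0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

With these assumed inputs the margin of error is about 1.96, so the interval is roughly 48.04 to 51.96.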

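Interval Estimation for population mean (μ)

SAMPLE SIZE: Small sample (n < 30)

FORMULAE
Known SD (σ): $\bar{x} \pm Z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
Unknown SD (σ): $\bar{x} \pm t_{\alpha/2}\,\dfrac{S}{\sqrt{n}}$, with $n-1$ degrees of freedom
Sample mean square ($S^2$): $S^2 = \dfrac{1}{n-1}\sum (x - \bar{x})^2$

Here $Z_{\alpha/2}$ and $t_{\alpha/2}$ are the critical values of Z and t at the chosen confidence level.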

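Interval Estimation for population proportion (p)

$p = \hat{p} \pm Z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

Here $\hat{p}$ is the sample proportion and $Z_{\alpha/2}$ is the critical value of Z at the chosen confidence level.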
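The interval for a proportion can be sketched the same way. The helper name `proportion_confidence_interval` and the counts used (60 successes in 200 trials) are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Interval for a proportion: p_hat ± Z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n                       # sample proportion
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)     # two-sided critical value
    margin = z * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - margin, p_hat + margin


# Hypothetical counts (assumed for illustration only)
p_low, p_high = proportion_confidence_interval(successes=60, n=200)
print(f"95% CI for p: ({p_low:.4f}, {p_high:.4f})")
```

For these assumed counts the sample proportion is 0.30 and the interval is roughly 0.236 to 0.364.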

Test of hypothesis
Hypothesis
Statements about characteristics of populations, denoted as H.

Types of Hypothesis
Null & Alternative hypothesis
Simple & Composite hypothesis

Hypothesis Testing
Null Hypothesis-
The hypothesis actually tested is called the null hypothesis. It is denoted as H0. It is the claim that is initially assumed to be true. It may usually be considered the skeptic's hypothesis: nothing new or interesting is happening here!

Alternative Hypothesis-
The other hypothesis, assumed true if the null is false, is the alternative hypothesis. It is denoted as H1 or Ha. Ha may usually be considered the researcher's hypothesis.
These two hypotheses are mutually exclusive and exhaustive, so that one is true to the exclusion of the other.

Hypothesis Testing
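Simple hypothesis-
It specifies the distribution completely, e.g. a claim that fixes a parameter at a single value.

Composite hypothesis-
It does not specify the distribution completely, e.g. a claim that allows a range of parameter values.

One-tailed test:
H0: μ1 = μ2
HA: μ1 > μ2 or μ1 < μ2

Two-tailed test:
H0: μ1 = μ2
HA: μ1 ≠ μ2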

Examples of Hypothesis :
There exists a positive relationship between
attendance and result.
Bankers assumed high-income earners are

Rules for Hypotheses


H0 is always stated as an equality claim
involving parameters.
Ha is an inequality claim that contradicts H0.
It may be one-sided (using either > or <) or
two-sided (using ≠).
A test of hypotheses is a method for using
sample data to decide whether the null
hypothesis should be rejected.
Rejection region - Values of the test
statistic for which we reject the null in favor
of the alternative hypothesis

Test Procedure
A test procedure is specified by
1. A test statistic, a function of the
sample data on which the decision is
to be based.
2. (Sometimes, not always!) A rejection
region, the set of all test statistic
values for which H0 will be rejected

Hypothesis Testing
                          True State
Test Result               H0 True             H0 False
H0 True (not rejected)    Correct Decision    Type II Error
H0 False (rejected)       Type I Error        Correct Decision

α = P(Type I Error), β = P(Type II Error)

Goal: Keep α, β reasonably small

Errors in Hypothesis Testing
A type I error consists of rejecting the null hypothesis H0 when it was true.
A type II error consists of not rejecting H0 when H0 is false.
α and β are the probabilities of type I and type II error, respectively.

Level α Test
Sometimes, the experimenter will fix the value of α, also known as the significance level. A test corresponding to the significance level α is called a level α test. A test with significance level α is one for which the type I error probability is controlled at the specified level.

Steps in Hypothesis-Testing Analysis


1. Set up H0 and Ha
2. Identify the nature of the sampling distribution
curve and specify the appropriate test statistic
3. Determine whether the hypothesis test is one-tailed or two-tailed
4. Taking into account the specified significance
level, determine the critical value (two critical
values for a two-tailed test) for the test statistic
from the appropriate statistical table
5. State the decision rule for rejecting H0
6. Compute the value for the test statistic from
the sample data
7. Using the decision rule specified in step 5,
either reject H0 or fail to reject H0

Large sample test(Z-test)


TEST FOR SINGLE MEAN
A cinema hall has a cool drinks fountain supplying orange and cola drinks. When the machine is turned on, it fills a 550 ml cup with 500 ml of the required drink. The manager has 3 problems.
1. The clients have been complaining that the machine supplies less than 500 ml.
2. The manager wants to make sure that the amount of cola does not exceed 500 ml.
3. The manager wants to minimize customer complaints and at the same time does not want any overflow.

In the case of the cinema hall, suppose n = 36, sample mean = 499 ml, and the specifications of the machine give the standard deviation of the output as 1 ml. The significance level is 10%.
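A minimal sketch of the cinema hall calculation follows. It assumes H0: μ = 500 ml and reads the manager's three problems as a left-tailed, a right-tailed and a two-tailed alternative respectively; that mapping is an interpretation, not something the slide states. Since the machine specification gives the standard deviation, it is treated as the known σ.

```python
from math import sqrt
from statistics import NormalDist

# Values from the cinema hall example
n, x_bar, mu_0, sigma, alpha = 36, 499.0, 500.0, 1.0, 0.10

# Test statistic: Z = (x_bar - mu_0) / (sigma / sqrt(n))
z = (x_bar - mu_0) / (sigma / sqrt(n))

std_normal = NormalDist()
z_one_tail = std_normal.inv_cdf(1 - alpha)      # about 1.28
z_two_tail = std_normal.inv_cdf(1 - alpha / 2)  # about 1.64

# Assumed mapping of the three problems to alternatives
reject_left = z < -z_one_tail           # Ha: mu < 500 (customer complaint)
reject_right = z > z_one_tail           # Ha: mu > 500 (overflow concern)
reject_two_sided = abs(z) > z_two_tail  # Ha: mu != 500 (both concerns)

print(f"z = {z:.2f}")
print(reject_left, reject_right, reject_two_sided)
```

The statistic comes out at about -6, far below -1.28, so under these assumptions H0 is rejected against the left-tailed and two-tailed alternatives but not against the right-tailed one.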

Large sample test(Z-test)


TEST FOR DIFFERENCE OF MEANS
Business Today has conducted a survey between Sonepur & Muzaffarpur on the hourly wages of laborers. Results of the survey are as follows.

Town           Mean Hourly Wages    S.D.        Sample size
Sonepur        Rs. 8.95             Rs. 0.40    200
Muzaffarpur    Rs. 9.10             Rs. 0.60    175

Business Today wants to test the hypothesis at the 0.05 significance level that there is no difference between hourly wages for the landless laborers in the two towns.
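The wage comparison can be worked through as a two-tailed Z-test for the difference of two means. This is a sketch that assumes H0: μ1 = μ2 and uses the sample standard deviations in place of the population values, which is the usual large-sample approximation.

```python
from math import sqrt
from statistics import NormalDist

# Survey results: (mean hourly wage, SD, sample size)
x1, s1, n1 = 8.95, 0.40, 200   # Sonepur
x2, s2, n2 = 9.10, 0.60, 175   # Muzaffarpur
alpha = 0.05

# Standard error of the difference between the two sample means
standard_error = sqrt(s1**2 / n1 + s2**2 / n2)

# Under H0 the hypothesised difference (mu1 - mu2) is 0
z = (x1 - x2) / standard_error

z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
reject_h0 = abs(z) > z_critical

print(f"z = {z:.2f}, critical = ±{z_critical:.2f}, reject H0: {reject_h0}")
```

Here z is about -2.81, which lies outside ±1.96, so the hypothesis of no difference is rejected at the 5% level.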

Large sample test(Z-test)


TEST FOR PROPORTION
A cable TV operator claims that 40% of the homes in a city have opted for his services. Before sponsoring advertisements on the local cable channel, a company conducted a survey and found that 250 out of 550 persons had cable TV services from the operator. On the basis of this data, can we accept the claim of the cable TV operator at the 1% level of significance?
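A sketch of the proportion test for the cable TV claim, assuming H0: p = 0.40 against a two-sided alternative (the slide does not say which alternative is intended):

```python
from math import sqrt
from statistics import NormalDist

successes, n = 250, 550
p_0, alpha = 0.40, 0.01

p_hat = successes / n  # sample proportion, about 0.4545

# Test statistic: Z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)
z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)

z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 2.58
reject_h0 = abs(z) > z_critical

print(f"p_hat = {p_hat:.4f}, z = {z:.2f}, reject H0: {reject_h0}")
```

The statistic is about 2.61, only just beyond the critical value of 2.58, so under this two-sided reading the 40% claim is rejected at the 1% level, though narrowly.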

Q1. An ambulance service claims that it takes, on average, 8.9 minutes to reach its destination in emergency calls. To check on this claim, the agency which licenses ambulance services has them timed on 50 emergency calls, getting a mean of 9.3 minutes with a standard deviation of 1.8 minutes. Test the claim at the 1% level of significance.

Q2. An automobile company decided to introduce a new car whose mean petrol consumption is claimed to be lower than that of the existing auto engine. It was found that the mean petrol consumption for 50 cars was 10 km per litre with a standard deviation of 3.5 km per litre. Test, at the 5% level of significance, the company's claim that in the new car petrol consumption is 9.5 km per litre on average.

Q3. Two types of new cars produced in India are tested for petrol mileage. One group consisting of 36 cars averaged 14 km per litre, while the other group consisting of 72 cars averaged 12.5 km per litre.
i) What test statistic is appropriate if the standard deviations of petrol consumption for the two types of cars are 1.5 and 2.0 respectively?
ii) Test whether there exists a significant difference in petrol consumption of those two types of cars at the 1% level of significance.

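Large sample test (Z-test)

Single Mean:
$Z = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$ (use σ in place of S when the population SD is known)

Difference of Means:
$Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$, where $\mu_1 - \mu_2 = 0$ under H0: μ1 = μ2

Proportion:
$Z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$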

Large sample test(Z-test)

Critical values of Z

Level of significance (α)                10%      5%       1%
Critical values for two-tailed test      ±1.64    ±1.96    ±2.58
Critical values for left-tailed test     -1.28    -1.64    -2.33
Critical values for right-tailed test    1.28     1.64     2.33
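The tabulated critical values can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from Python's standard library; the helper name `critical_values` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def critical_values(alpha):
    """Return Z critical values for a given significance level."""
    return {
        "two-tailed": std_normal.inv_cdf(1 - alpha / 2),  # use as ± value
        "left-tailed": std_normal.inv_cdf(alpha),
        "right-tailed": std_normal.inv_cdf(1 - alpha),
    }


for alpha in (0.10, 0.05, 0.01):
    row = critical_values(alpha)
    print(f"{alpha:.0%}", {name: round(value, 2) for name, value in row.items()})
```

Rounded to two decimals, these agree with the table; the 10% two-tailed value is 1.645 to three decimals, which is why some tables print 1.64 and others 1.65.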

Small sample test(t-test)


TEST FOR SINGLE MEAN
The average breaking strength of steel rods is specified to be 18.5 thousand kg. To check this, a sample of 14 rods was tested. The mean and standard deviation obtained were 17.85 and 1.955 thousand kg respectively. Test the significance of the deviation at the 5% level.
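A sketch of the steel rod calculation using t = (x̄ − μ)/(S/√n). Two assumptions are made here: the reported 1.955 is taken to be the sample standard deviation with divisor n − 1, and the two-tailed critical value 2.160 for 13 degrees of freedom is read from a standard t-table rather than computed.

```python
from math import sqrt

n, x_bar, mu_0, s = 14, 17.85, 18.5, 1.955  # values from the example

# Test statistic with n - 1 degrees of freedom
t = (x_bar - mu_0) / (s / sqrt(n))
degrees_of_freedom = n - 1

# Assumed table value: two-tailed, alpha = 0.05, 13 degrees of freedom
T_CRITICAL = 2.160
reject_h0 = abs(t) > T_CRITICAL

print(f"t = {t:.2f}, df = {degrees_of_freedom}, reject H0: {reject_h0}")
```

Under these assumptions t is about -1.24, which is inside ±2.160, so the deviation from 18.5 is not significant at the 5% level.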

Small sample test(t-test)


TEST FOR DIFFERENCE MEAN
The average life of a sample of 10 electric light bulbs was found to be 1456 hours with a standard deviation of 423 hours. A second sample of 17 bulbs chosen from a different batch showed a mean life of 1280 hours with a standard deviation of 398 hours. Is there a significant difference between the means of the two batches? Test at the 5% level of significance.
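The bulb comparison can be sketched as a pooled-variance t-test. It assumes both batches share a common variance, that the reported standard deviations use divisor n − 1, and that the two-tailed critical value 2.060 for 25 degrees of freedom comes from a standard t-table.

```python
from math import sqrt

x1, s1, n1 = 1456.0, 423.0, 10   # first batch
x2, s2, n2 = 1280.0, 398.0, 17   # second batch

# Pooled variance: ((n1-1)*s1^2 + (n2-1)*s2^2) / (n1 + n2 - 2)
degrees_of_freedom = n1 + n2 - 2
pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / degrees_of_freedom

# Standard error of the difference between the two sample means
standard_error = sqrt(pooled_variance) * sqrt(1 / n1 + 1 / n2)

# Under H0 the hypothesised difference (mu1 - mu2) is 0
t = (x1 - x2) / standard_error

# Assumed table value: two-tailed, alpha = 0.05, 25 degrees of freedom
T_CRITICAL = 2.060
reject_h0 = abs(t) > T_CRITICAL

print(f"t = {t:.2f}, df = {degrees_of_freedom}, reject H0: {reject_h0}")
```

Here t is about 1.08, well inside ±2.060, so under these assumptions the difference between the two batch means is not significant.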

Small sample test(t-test)


PAIRED SAMPLES
The HRD manager wishes to see if there has been any change in the ability of trainees after a specific program. The trainees take a test before the start of the program and an equivalent one after they have completed it. The scores recorded are given below. Has any change taken place at the 5% level of significance?

Trainee:                 A    B    C    D    E    F    G    H    I
Score before training:   75   70   46   68   68   43   55   68   77
Score after training:    70   77   57   60   79   64   55   77   76

Small sample test(t-test)


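Single Mean:
$t = \dfrac{\bar{x} - \mu}{S/\sqrt{n}}$, where $S = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

Difference of Means:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{S_{\bar{x}_1 - \bar{x}_2}}$

$S_{\bar{x}_1 - \bar{x}_2} = S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$, where $S^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$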

Small sample test(t-test)


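Paired t-test

$t = \dfrac{\bar{d} - \mu_d}{S_d/\sqrt{n}}$, where $S_d = \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}}$

Where $d$ is the difference between the paired scores, $\bar{d}$ is the mean of the differences between paired observations, and $\mu_d$ is the hypothesised mean difference (0 under H0 of no change).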
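A sketch of the paired t-test applied to the trainee scores from the HRD example. The assignment of scores to trainees A to I is an assumption, because the table in the source is flattened. The two-tailed critical value 2.306 for 8 degrees of freedom is taken from a standard t-table.

```python
from math import sqrt
from statistics import mean, stdev

# Scores for trainees A..I (ordering assumed from the flattened table)
before = [75, 70, 46, 68, 68, 43, 55, 68, 77]
after = [70, 77, 57, 60, 79, 64, 55, 77, 76]

# Paired differences d = after - before
differences = [a - b for a, b in zip(after, before)]
n = len(differences)

d_bar = mean(differences)   # mean of the differences
s_d = stdev(differences)    # sample SD of the differences (divisor n - 1)

# Under H0 (no change) the hypothesised mean difference is 0
t = d_bar / (s_d / sqrt(n))
degrees_of_freedom = n - 1

# Assumed table value: two-tailed, alpha = 0.05, 8 degrees of freedom
T_CRITICAL = 2.306
reject_h0 = abs(t) > T_CRITICAL

print(f"d_bar = {d_bar}, s_d = {s_d:.3f}, t = {t:.2f}, reject H0: {reject_h0}")
```

With this ordering the mean difference is 5, the standard deviation of the differences is about 9.21 and t is about 1.63, which is inside ±2.306, so no significant change is detected at the 5% level.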
